ABSTRACT. We prove a vector-valued non-homogeneous Tb theorem on certain
quasimetric spaces equipped with what we call an upper doubling measure.
Essentially, we merge recent techniques from the domain and range side of things,
achieving a Tb theorem which is quite general with respect to both of them.

1. Introduction 

In the seminal paper [NTV03] by Nazarov, Treil and Volberg, it was already
indicated that it should be possible to prove some version of their (Euclidean)
non-homogeneous Tb theorem also in a more abstract metric space setting, just
like the well-established homogeneous theory in this generality [DJS85], [Chr90].



A recent paper [HM09] by the author and Tuomas Hytönen shows that this is
indeed the case: a non-homogeneous Tb theorem in the general framework of
quasimetric spaces equipped with an upper doubling measure (this is a class
of measures that encompasses both the power bounded measures and the
more classical doubling measures) was proved. See also [VW09a].

It is natural to seek to extend the generality in the range too (instead of
considering only scalar valued operators). Developments of this type, just like the
regular scalar valued Tb theorems, have a long history (for a discussion of the
origins of the vector-valued Tb theory consult e.g. [Hyt09b]). In the very recent
work [MP10], a UMD-valued T1 theorem is established in metric spaces - however,
only with Ahlfors-regular measures $\mu$ (i.e. $\mu(B(x,r)) \sim r^m$). This
assumption seems to be necessary for their method of proof based on rearrangements
of dyadic cubes. In [Hyt09b] a vector-valued non-homogeneous Tb theorem is
proved in the case of the domain being $\mathbb{R}^n$ and the relevant measure
$\mu$ being power bounded (that is, $\mu(B(x,r)) \le Cr^m$).

The methods of [Hyt09b] are already less dependent on the structure of $\mathbb{R}^n$
than much of the earlier vector-valued work, thus foreshadowing the possibility



2000 Mathematics Subject Classification. 42B20 (Primary); 30L99, 46B09, 46E40, 60D05, 60G46 
(Secondary). 

Key words and phrases. Calderón-Zygmund operator, non-doubling measure, probabilistic
constructions in metric spaces, martingale difference, paraproduct.

The author is supported by the Academy of Finland through the project "$L^p$ methods in
harmonic analysis". The paper is part of the author's doctoral thesis project written under
the supervision of Academy research fellow Tuomas Hytönen - the guidance of whom is
gratefully acknowledged.




HENRI MARTIKAINEN 



of extending to more general domains. The goal here is to carefully combine
key techniques from the recent developments [HM09] and [Hyt09b] and obtain a
proof of a non-homogeneous Tb theorem, which is simultaneously general with
respect to the domain (a metric space), the measure (an upper doubling measure)
and the range (a UMD Banach space).

2. Preliminaries and the main result 

2.1. Geometrically doubling quasimetric spaces. A quasimetric space $(X, \rho)$ is
geometrically doubling if every open ball $B(x,r) = \{y \in X : \rho(y,x) < r\}$ can
be covered by at most $N$ balls of radius $r/2$. A basic observation is that in a
geometrically doubling quasimetric space, a ball $B(x,r)$ can contain the centers
$x_i$ of at most $N\alpha^{-n}$ disjoint balls $B(x_i, \alpha r)$ for $\alpha \in (0,1]$.
Instead of working with what we called regular quasimetrics in [HM09], it will be
assumed for added convenience that $\rho = d^\beta$ for some metric $d$ and some
constant $\beta \ge 1$ (and not just that $\rho$ is equivalent to such a power of a
metric). Then $d$-balls are $\rho$-balls and even the weak boundedness property works
for both types of balls (this was somewhat of an inconvenience before). This seems to
be general enough to cover many interesting cases.

2.2. Upper doubling measures. A Borel measure $\mu$ in some quasimetric space
$(X, \rho)$ is called upper doubling if there exists a dominating function
$\lambda : X \times (0,\infty) \to (0,\infty)$ so that $r \mapsto \lambda(x,r)$ is
non-decreasing, $\lambda(x,2r) \le C_\lambda \lambda(x,r)$ and
$\mu(B(x,r)) \le \lambda(x,r)$ for all $x \in X$ and $r > 0$. The number
$d := \log_2 C_\lambda$ can be thought of as (an upper bound for) a dimension of the
measure $\mu$, and it will play a similar role as the quantity denoted by the same
symbol in [NTV03].
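The definition can be checked numerically on a toy example. The following is a minimal sketch, assuming the Lebesgue measure on the real line (so $\mu(B(x,r)) = 2r$) and the power-type dominating function $\lambda(x,r) = Cr^m$; the helper names are hypothetical and are not taken from the paper.

```python
import math

def power_dominating(C, m):
    """Return lam(x, r) = C * r**m, the dominating function of a power bounded measure."""
    def lam(x, r):
        return C * r ** m
    return lam

def observed_doubling_constant(lam, points, radii):
    """Largest observed ratio lam(x, 2r) / lam(x, r) over a finite sample."""
    return max(lam(x, 2 * r) / lam(x, r) for x in points for r in radii)

def is_dominated_on_sample(mu_ball, lam, points, radii):
    """Check mu(B(x, r)) <= lam(x, r) on a finite sample of centers and radii."""
    return all(mu_ball(x, r) <= lam(x, r) for x in points for r in radii)

# Toy instance (an assumption for illustration): Lebesgue measure on the line.
lam = power_dominating(C=2.0, m=1)
mu_ball = lambda x, r: 2.0 * r
points, radii = [0.0, 1.0, -3.5], [0.25, 1.0, 8.0]

C_lam = observed_doubling_constant(lam, points, radii)  # equals 2**m here
d = math.log2(C_lam)                                    # the "dimension" d = log2(C_lambda)
dominated = is_dominated_on_sample(mu_ball, lam, points, radii)
```

For a power bounded measure the sketch recovers $C_\lambda = 2^m$ and $d = m$, while for a doubling measure one may take $\lambda(x,r) = \mu(B(x,r))$; this is how both classes fit into the upper doubling framework.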

2.3. Standard kernels and Calderón-Zygmund operators. Define
$\Delta = \{(x,x) : x \in X\}$. A standard kernel is a mapping
$K : X^2 \setminus \Delta \to \mathbb{C}$ for which we have for some $\alpha > 0$ and
$B, C < \infty$ that
\[
|K(x,y)| \le B \min\Big( \frac{1}{\lambda(x,\rho(x,y))}, \frac{1}{\lambda(y,\rho(x,y))} \Big), \qquad x \ne y,
\]
and
\[
|K(x,y) - K(x',y)| + |K(y,x) - K(y,x')| \le B \frac{\rho(x,x')^\alpha}{\rho(x,y)^\alpha \lambda(x,\rho(x,y))}, \qquad \rho(x,y) \ge C\rho(x,x').
\]
The smallest admissible $B$ will be denoted by $\|K\|_{CZ_\alpha}$; it is understood that
the parameter $C$ has been fixed, and it will not be indicated explicitly in this notation.

Let $T : f \mapsto Tf$ be a linear operator acting on some functions $f$ (which we shall
specify in more detail later). It is called a Calderón-Zygmund operator with
kernel $K$ if
\[
Tf(x) = \int_X K(x,y) f(y)\,d\mu(y)
\]
for $x$ outside the support of $f$.

2.4. Accretivity. A function $b \in L^\infty(\mu)$ is called accretive if
$\mathrm{Re}\, b \ge a > 0$ almost everywhere. We can also make do with the following
weaker form of accretivity: $|\int_A b\,d\mu| \ge a\mu(A)$ for all Borel sets $A$ which
satisfy the condition that $B \subset A \subset CB$ for some ball $B = B(A)$, where $C$
is some large constant which depends on the quasimetric $\rho$. (One can e.g. take
$C = 500$ if dealing with metrics.)

2.5. Weak boundedness property. An operator $T$ is said to satisfy the weak
boundedness property if $|\langle T\chi_B, \chi_B \rangle| \le A\mu(\Lambda B)$ for all
balls $B$ and for some fixed constants $A > 0$ and $\Lambda > 1$. Here
$\langle \cdot, \cdot \rangle$ is the bilinear duality $\langle f, g \rangle = \int fg\,d\mu$.
Let us denote the smallest admissible constant above by $\|T\|_{WBP_\Lambda}$.

In the Tb theorem, the weak boundedness property is demanded from the operator
$M_{b_2} T M_{b_1}$, where $b_1$ and $b_2$ are accretive functions and $M_b : f \mapsto bf$.

2.6. BMO and RBMO. We say that $f \in L^1_{\mathrm{loc}}(\mu)$ belongs to
$\mathrm{BMO}^p_\kappa(\mu)$, if for any ball $B \subset X$ there exists a constant $f_B$
such that
\[
\Big( \int_B |f - f_B|^p\,d\mu \Big)^{1/p} \le L\mu(\kappa B)^{1/p},
\]
where the constant $L$ does not depend on $B$.

Let $\varrho > 1$. A function $f \in L^1_{\mathrm{loc}}(\mu)$ belongs to
$\mathrm{RBMO}(\mu)$ if there exists a constant $L$, and for every ball $B$, a constant
$f_B$, such that one has
\[
\int_B |f - f_B|\,d\mu \le L\mu(\varrho B),
\]
and, whenever $B \subset B_1$ are two balls,
\[
|f_B - f_{B_1}| \le L\Big( 1 + \int_{2B_1 \setminus B} \frac{1}{\lambda(c_B, \rho(x,c_B))}\,d\mu(x) \Big).
\]
We do not demand that $f_B$ be the average
$\langle f \rangle_B = \mu(B)^{-1}\int_B f\,d\mu$, and this is actually important in the
$\mathrm{RBMO}(\mu)$-condition. The useful thing here is that the space
$\mathrm{RBMO}(\mu)$ is independent of the choice of parameter $\varrho > 1$ and
satisfies the John-Nirenberg inequality. For these results in our setting, see [Hyt09a].
The norms in these spaces are defined in the obvious way as the best constant $L$.

2.7. UMD Banach spaces. A Banach space $Y$ is said to satisfy the UMD property
if there holds that
\[
\Big\| \sum_{k=1}^n \epsilon_k d_k \Big\|_{L^p(\Omega,Y)} \le C \Big\| \sum_{k=1}^n d_k \Big\|_{L^p(\Omega,Y)}
\]
whenever $(d_k)_{k=1}^n$ is a martingale difference sequence in $L^p(\Omega,Y)$ and
$\epsilon_k = \pm 1$ are constants. This property does not depend on the parameter
$1 < p < \infty$ in any way.
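To make the inequality concrete, here is a small sketch of the simplest possible case only: $Y = \mathbb{R}$ and $p = 2$, where martingale differences are orthogonal and the inequality holds with constant $1$ (in fact with equality). The particular martingale below, built from three independent random signs, is an assumption chosen for illustration; it says nothing about general UMD spaces.

```python
from itertools import product

def martingale_differences(omega):
    """d_k = v_k(r_1, ..., r_{k-1}) * r_k: a predictable multiplier times a fresh sign."""
    r1, r2, r3 = omega
    return [2 * r1, (1 + r1) * r2, (3 - r1 * r2) * r3]

def second_moment(signs):
    """E |sum_k eps_k d_k|^2 under the uniform measure on {-1, 1}^3."""
    total = 0
    for omega in product((-1, 1), repeat=3):
        d = martingale_differences(omega)
        total += sum(e * dk for e, dk in zip(signs, d)) ** 2
    return total / 8

baseline = second_moment((1, 1, 1))
# Ratio of the transformed norm (squared) to the original one, for every sign choice.
ratios = [second_moment(eps) / baseline for eps in product((-1, 1), repeat=3)]
```

Every ratio equals $1$ because the cross terms $E(d_j d_k)$, $j \ne k$, vanish. For $p \ne 2$ or for a general Banach space $Y$ no such orthogonality is available, which is why UMD has to be imposed as an assumption on $Y$.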
2.8. Vinogradov notation and implicit constants. The notation $f \lesssim g$ is used
synonymously with $f \le Cg$ for some constant $C$. We also use $f \sim g$ if
$f \lesssim g \lesssim f$. The dependence on the various parameters should be somewhat
clear, but basically $C$ may depend on the various constants of the above definitions,
and on an auxiliary parameter $r$ (which is eventually fixed to depend on the above
parameters only).

We now state our main theorem. 

2.9. Theorem. Let $(X, \rho)$ be a geometrically doubling quasimetric space so that
$\rho = d^\beta$ for some metric $d$ and $\beta \ge 1$, and assume that this space is
equipped with an upper doubling measure $\mu$. Let $Y$ be a UMD space and
$1 < p < \infty$. Let $T$ be an $L^p(X,Y)$-bounded Calderón-Zygmund operator with a
standard kernel $K$, $b_1$ and $b_2$ be two accretive functions, $\alpha > 0$ and
$\kappa, \Lambda > 1$. Then
\[
\|T\| \lesssim \|Tb_1\|_{\mathrm{BMO}^1_\kappa(\mu)} + \|T^*b_2\|_{\mathrm{BMO}^1_\kappa(\mu)} + \|M_{b_2}TM_{b_1}\|_{WBP_\Lambda} + \|K\|_{CZ_\alpha},
\]
where the first three terms on the right are in turn dominated by $\|T\|$. Here, of course,
$\|T\| = \|T\|_{L^p(X,Y) \to L^p(X,Y)}$.

Note that it suffices to prove the theorem in the case $\beta = 1$, that is, we are
working in an honest metric space $(X, d)$ from now on. We give an example
before proceeding with the proof of the theorem.

2.10. Example. In [HM09, chapter 12] we gave an example related to the paper
[VW09b], and there the application was in a situation where the measure in
question was genuinely upper doubling (the doubling theory or the theory of power
bounded measures would not have sufficed), and the space was a quasimetric
one (so it really was non-homogeneous theory on metric spaces).

Now we give an example which is actually in the homogeneous situation, but 
as the domain is a metric space and the range is a general UMD space, this seems 
not to follow from the previous works. Also, it goes to show that it is convenient 
to get this doubling theory as a byproduct of the upper doubling theory. 

The example we have in mind is the boundedness of the classical Cauchy-Szegő
projection as a UMD-valued operator (this question was asked by Tao Mei
through a private communication with Tuomas Hytönen, and Mei had solved
this question in the special case when the range space $Y$ is a so-called
non-commutative $L^p$ space). The setting is the Heisenberg group $\mathbb{H}^n$, which
is identified with $\mathbb{R}^{2n+1}$, and is a non-abelian group where the group
operation is given by
\[
x \cdot y = \Big( x_1 + y_1, \ldots, x_{2n} + y_{2n},\ x_{2n+1} + y_{2n+1} - 2\sum_{j=1}^n (x_j y_{j+n} - x_{j+n} y_j) \Big).
\]
The metric is given by
\[
d(x,y) = \|x^{-1} \cdot y\|,
\]
where
\[
\|x\| = \big( \|(x_1, \ldots, x_{2n})\|^4_{\mathbb{R}^{2n}} + x_{2n+1}^2 \big)^{1/4}.
\]
One can also write $x = [\zeta, t] \in \mathbb{H}^n = \mathbb{C}^n \times \mathbb{R}$. We
use the Haar measure for $\mathbb{H}^n$ (this is just the Euclidean Lebesgue measure
$d\zeta\,dt$ on $\mathbb{C}^n \times \mathbb{R}$). Now $\lambda(x,r) = Cr^{2n+2}$
for some appropriate constant $C$.
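The group law and the metric above are easy to experiment with. The sketch below is illustrative only: points of $\mathbb{H}^n$ are stored as tuples of length $2n+1$, and the function names (`mul`, `inv`, `norm`, `dist`, `dilate`) are hypothetical. It also includes the anisotropic dilation $(x', t) \mapsto (rx', r^2 t)$, which scales the norm by $r$ and the Lebesgue measure by $r^{2n+2}$; this is the reason behind the choice $\lambda(x,r) = Cr^{2n+2}$.

```python
def mul(x, y):
    """Group law on H^n for points given as tuples of length 2n + 1."""
    n = (len(x) - 1) // 2
    twist = sum(x[j] * y[j + n] - x[j + n] * y[j] for j in range(n))
    head = tuple(a + b for a, b in zip(x[:-1], y[:-1]))
    return head + (x[-1] + y[-1] - 2 * twist,)

def inv(x):
    """Inverse element: the twist term vanishes on the pair (x, -x)."""
    return tuple(-a for a in x)

def norm(x):
    """Homogeneous norm (|x'|^4 + t^2)^(1/4) for x = (x', t)."""
    horizontal_sq = sum(a * a for a in x[:-1])
    return (horizontal_sq ** 2 + x[-1] ** 2) ** 0.25

def dist(x, y):
    """Left-invariant distance d(x, y) = ||x^{-1} . y||."""
    return norm(mul(inv(x), y))

def dilate(x, r):
    """Anisotropic dilation (x', t) -> (r x', r^2 t)."""
    return tuple(r * a for a in x[:-1]) + (r * r * x[-1],)

# Sample points in H^1 (identified with R^3).
x, y, z = (1, 2, 3), (-2, 5, 1), (4, -1, 7)
```

With integer coordinates the algebraic identities (associativity, inverses, left invariance of `dist`) hold exactly, while `mul(x, y)` and `mul(y, x)` differ in the last coordinate, reflecting that the group is non-abelian.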

Using the above notation $x = [\zeta, t]$, let $K(x) = c(t + i|\zeta|^2)^{-n-1}$. Set
$K(x,y) = K(y^{-1} \cdot x)$ for $x \ne y$ (i.e. $y^{-1} \cdot x \ne 0$). The
Cauchy-Szegő projection $C$ is an $L^2$-bounded operator of the form
\[
Cf(x) = \int_{\mathbb{H}^n} K(x,y) f(y)\,dy.
\]
See e.g. [Ste93] for a more exhaustive treatment of the Cauchy-Szegő projection.

Clearly the standard kernel estimates known for $K$ are precisely the same as
demanded by our theory with our chosen $\lambda$. Thus, as $C$ is a Calderón-Zygmund
operator which is bounded as a scalar-valued operator (and thus satisfies the
BMO conditions with e.g. $b_1 = b_2 = 1$ and the weak boundedness property), we
have by our above Tb (or T1 in this case) theorem that $C$ is a bounded operator
$L^p(\mathbb{H}^n, Y) \to L^p(\mathbb{H}^n, Y)$ for every UMD space $Y$ and for every
index $p \in (1, \infty)$.



3. John-Nirenberg theorem for $Tb_1$

In [HM09] it was assumed that $Tb_1, T^*b_2 \in \mathrm{BMO}^2_\kappa(\mu)$ (and this is
natural enough for the $L^2$ theory) so there it was not necessary to deal with the
contents of this chapter. However, now that we are directly doing $L^p$ theory, it seems
to be more important to prove that
$Tb_1 \in \mathrm{BMO}^1_\kappa(\mu) \Rightarrow Tb_1 \in \mathrm{RBMO}(\mu) \Rightarrow Tb_1 \in \mathrm{BMO}^q_\kappa(\mu)$
for all $1 \le q < \infty$ (and similarly for $T^*b_2$). Indeed, otherwise we would
need to assume a priori that
$Tb_1, T^*b_2 \in \bigcap_{1 \le q < \infty} \mathrm{BMO}^q_\kappa(\mu)$. This reduction is
known in the Euclidean setting with a power bounded measure (see [NTV03]).
We now work out the details in our setting. However, only one key lemma really
requires some modifications from the proof found in [NTV03], and so we only
sketch the other parts of the argument. See also [Hyt09a], where the details of
the $\mathrm{RBMO}(\mu)$ theory, especially the John-Nirenberg inequality, are worked out
in our setting.

3.1. Lemma. Consider some fixed ball $B = B(c_B, r_B)$. There exists
$R_B \in [r_B, 1.2r_B]$ so that
\[
\mu(\{x \in X : R_B - r_B s < d(x, c_B) < R_B + r_B s\}) \lesssim s\mu(B(c_B, 3r_B))
\]
for all $s \in [0, 1.5]$.

Proof. See [NTV03, p. 184]. □
3.2. Lemma. If $B = B(c_B, r_B)$ is a ball and $R_B$ is a related regularized radius as
in the previous lemma, then it holds that
\[
\int_{B(c_B,R_B)} \int_{B(c_B,3r_B) \setminus B(c_B,R_B)} |K(x,y)|\,d\mu(y)\,d\mu(x)
\lesssim \mu(B(c_B,R_B))^{1/2} \mu(B(c_B,3r_B))^{1/2} \le \mu(B(c_B,3r_B)).
\]

Proof. Consider $f(x) = \int_{B(c_B,3r_B) \setminus B(c_B,R_B)} |K(x,y)|\,d\mu(y)$,
$x \in B(c_B,R_B)$. Fix $x \in B(c_B,R_B)$ for the moment and note that we have for all
$y \in B(c_B,3r_B) \setminus B(c_B,R_B)$ that $d(x,y) < R_B + 3r_B \le 4.2r_B < 5r_B$ and
$d(x,y) \ge d(y,c_B) - d(x,c_B) \ge R_B - d(x,c_B)$. We temporarily set
$h = R_B - d(x,c_B)$ for this fixed $x$ and estimate
\[
\begin{aligned}
f(x) &\lesssim \int_{h \le d(x,y) < 5r_B} \frac{d\mu(y)}{\lambda(x,d(x,y))} \\
&\le \sum_{1 \le i \le \log_2(10r_B/h)} \int_{2^{i-1}h \le d(x,y) < 2^i h} \frac{d\mu(y)}{\lambda(x,d(x,y))} \\
&\le \sum_{1 \le i \le \log_2(10r_B/h)} \frac{\mu(B(x,2^i h))}{\lambda(x,2^{i-1}h)} \\
&\lesssim \log\Big( \frac{10r_B}{h} \Big) = \log\Big( \frac{10r_B}{R_B - d(x,c_B)} \Big).
\end{aligned}
\]
This implies through Hölder's inequality that
\[
\int_{B(c_B,R_B)} f(x)\,d\mu(x) \lesssim \mu(B(c_B,R_B))^{1/2} \Big( \int_{B(c_B,R_B)} \Big[ \log\Big( \frac{10r_B}{R_B - d(x,c_B)} \Big) \Big]^2 d\mu(x) \Big)^{1/2}.
\]
We then continue to note that
\[
\int_{B(c_B,R_B)} \Big[ \log\Big( \frac{10r_B}{R_B - d(x,c_B)} \Big) \Big]^2 d\mu(x)
\]
equals
\[
\int_0^\infty \mu\Big( \Big\{ x \in B(c_B,R_B) : \Big[ \log\Big( \frac{10r_B}{R_B - d(x,c_B)} \Big) \Big]^2 > t \Big\} \Big)\,dt,
\]
which in turn equals
\[
\begin{aligned}
\int_0^\infty &\mu(\{x : R_B - 10r_B e^{-\sqrt{t}} < d(x,c_B) < R_B\})\,dt \\
&= \Big[ \log\Big( \frac{10r_B}{R_B} \Big) \Big]^2 \mu(B(c_B,R_B)) + \int_{[\log(10r_B/R_B)]^2}^\infty \mu(\{x : R_B - 10r_B e^{-\sqrt{t}} < d(x,c_B) < R_B\})\,dt.
\end{aligned}
\]
Note that $\int_0^\infty e^{-\sqrt{t}}\,dt = 2$ and use the previous lemma with
$s = 10e^{-\sqrt{t}} \le R_B/r_B \le 1.2 < 1.5$ for $t \ge [\log(10r_B/R_B)]^2$ to get that
\[
\int_{[\log(10r_B/R_B)]^2}^\infty \mu(\{x : R_B - 10r_B e^{-\sqrt{t}} < d(x,c_B) < R_B\})\,dt \lesssim \mu(B(c_B,3r_B)).
\]
This yields the claim. □

3.3. Theorem. Under the assumptions of Theorem 2.9 there holds that
$Tb_1 \in \mathrm{RBMO}(\mu)$, especially
$Tb_1 \in \bigcap_{1 \le q < \infty} \mathrm{BMO}^q_{\kappa_1}(\mu)$ for any $\kappa_1 > 1$.

Proof. It suffices to prove that for every ball $B$ the function $T(\chi_{10B}b_1)$
satisfies the defining properties of the $\mathrm{RBMO}(\mu)$ space for all the balls
that are subsets of $B$, and in such a way that the RBMO norm does not depend on $B$. To
see that this suffices, note that $|Tb_1 - T(\chi_{10B}b_1)| \lesssim 1$ on $B$ for all
balls $B$. The hardest part of the remaining proof consists of proving that
\[
\int_B |T(\chi_{2B}b_1)|\,d\mu \lesssim \mu(\eta B)
\]
for $\eta = \max(2\kappa, 2\Lambda, 3)$ (the rest of the proof unfolds naturally). This
inequality follows from duality using the assumption
$Tb_1 \in \mathrm{BMO}^1_\kappa(\mu)$, the weak boundedness property, the previous lemma
and the fact that $b_1$ is accretive. These details follow as in [NTV03, chapter 2]. □

4. Random dyadic systems and good/bad cubes 



One feature of the proof in [Hyt09b] is that one basically takes all the cubes to
be good in the various summations - this is in contrast with the proof in [HM09]
where things were usually summed so that the bigger cubes are arbitrary but the
smaller cubes from the other grid were assumed to be good. This modification
seems to be particularly useful when dealing with certain paraproducts in these
general UMD spaces.

This leads us to fiddle with our randomization from [HM09] quite a bit. We
shall make the randomization so that there is no removal procedure involved
(unlike in [HM09]) - then a certain index set may serve as a fixed reference set
more conveniently. Such a modification will also be used in a future paper by T.
Hytönen and A. Kairema, and the author learned about the details of this
modification from them through a private communication.

Furthermore, we will change the definition of a good cube to be such that
given a cube $Q$ its chance of being good does not depend on the smaller cubes $R$
with $\ell(R) \le \ell(Q)$. Related to this we shall also make a minor tweak to our
half-open cubes from [HM09] (to get a better dependence on the randomized dyadic
points). Finally, we add a layer of artificial badness so that
$\mathbb{P}(Q \text{ is good})$ does not depend on the particular choice of the cube $Q$.

Let us get to the details. Let $\delta = 1/1000$. We recall from [HM09] (see also
[Chr90] for the original construction) that given a collection of points $x^k_\alpha$
such that $d(x^k_\alpha, x^k_\beta) \ge \delta^k/8$ for all $\alpha \ne \beta$ and
$\min_\alpha d(x, x^k_\alpha) < 4\delta^k$, we may define a certain transitive relation
$\le_{\mathcal{D}}$ between these points, and then there exist sets $Q^k_\alpha$
(we call these half-open dyadic cubes) so that for every $k \in \mathbb{Z}$ we have
\[
X = \bigcup_\alpha Q^k_\alpha,
\]
for every $k \in \mathbb{Z}$ and $\ell \ge k$ it holds that either
$Q^k_\alpha \cap Q^\ell_\beta = \emptyset$ or $Q^\ell_\beta \subset Q^k_\alpha$, and for
every $\ell \ge k$ we have
\[
Q^k_\alpha = \bigcup_{\beta :\, (\ell,\beta) \le_{\mathcal{D}} (k,\alpha)} Q^\ell_\beta.
\]
This set of cubes is denoted by $\mathcal{D} = \{Q^k_\alpha\}$. Moreover, these cubes
satisfy that $d(Q^k_\alpha) \le C_0\delta^k$ and
$B(x^k_\alpha, C_1\delta^k) \subset Q^k_\alpha$ for $C_0 = 10$ and $C_1 = 1/100$. We
denote $\ell(Q^k_\alpha) = \delta^k$.

We now fix some large natural number $k_0$ the value of which will be specified
more carefully in the next chapter. The tweak we make to the construction of the
above cubes is simple: we follow the construction in [HM09, chapter 4] except
that in the proof of [HM09, Theorem 4.4] we make the construction so that $k = 0$,
$k < 0$ and $k > 0$ are replaced by $k = k_0$, $k < k_0$ and $k > k_0$ respectively.
The point is that the original dyadic cubes of M. Christ (which may not cover the whole
space unlike these half-open ones) have a better dependence on the centers
$x^k_\alpha$ than the half-open cubes. After the modification, however, we have this
more favourable dependence at least for all the cubes of generations $k \le k_0$.
Indeed, now a cube $Q^k_\alpha$, where $k \le k_0$, depends only on the centers
$x^\ell_\beta$ for $\ell \ge k$. All the properties stated above remain valid, of course.

We explain the modified randomization now. One starts by fixing once and for
all the reference points $z^k_\alpha$ satisfying $d(z^k_\alpha, z^k_\beta) \ge \delta^k$
for all $\alpha \ne \beta$ and $\min_\alpha d(x, z^k_\alpha) < \delta^k$. We also fix
one relation $\le$ related to these points. We say that $(k,\alpha)$ and $(k,\beta)$
conflict if $d(z^{k+1}_\gamma, z^{k+1}_\sigma) < \delta^k/4$ for some
$(k+1,\gamma) \le (k,\alpha)$ and $(k+1,\sigma) \le (k,\beta)$. Let $I(k,\alpha)$ be the
set of pairs $(k,\beta)$ conflicting with $(k,\alpha)$. Note that
$\# I(k,\alpha) \lesssim 1$ as $X$ is geometrically doubling. We now earmark the points
$z^k_\alpha$ (or the indices $(k,\alpha)$). To this end, fix some
$L > \max_{(k,\alpha)} \# I(k,\alpha)$. Let $k \in \mathbb{Z}$ be given. We inductively
tag $z^k_\alpha$ by associating it with the smallest number $i \in \{1, \ldots, L\}$
having the feature that no $(k,\beta) \in I(k,\alpha)$ that has already been tagged is
associated with this number (recall that $\alpha$ always varies only over some
countable set).

We now associate to each $(k,\alpha)$ a new point $x^k_\alpha$ in a random way. First
one randomly chooses $i \in \{1, \ldots, L\}$ (uniform distribution, of course). If
$(k,\alpha)$ happens to be earmarked with the number $i$, we set
$x^k_\alpha = z^{k+1}_\beta$ for some $(k+1,\beta) \le (k,\alpha)$, and the choice is
made using uniform probability (there are only boundedly many indices
$(k+1,\beta) \le (k,\alpha)$). If $(k,\alpha)$ is not tagged with the number $i$, we
set $x^k_\alpha = z^{k+1}_\beta$ for some $(k+1,\beta)$ for which it holds that
$d(z^k_\alpha, z^{k+1}_\beta) < \delta^{k+1}$ (there is always at least one such point
available by construction). To summarize, for $i$-tagged indices we randomly choose any
$z^{k+1}_\beta$ for which $(k+1,\beta) \le (k,\alpha)$ and for the rest we choose some
special $z^{k+1}_\beta$ which is particularly close to $z^k_\alpha$. This is done
independently on all levels $k \in \mathbb{Z}$. The idea of using this tagging as a way
to avoid the removal procedure used in [HM09] is by T. Hytönen and A. Kairema.
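The inductive tagging is a greedy colouring of the conflict relation on one fixed level $k$. The following sketch illustrates it on a toy conflict structure; the function name `tag_indices` and the sample data are hypothetical, and the enumeration order of the indices is an assumption (the text only requires that the indices form a countable set).

```python
def tag_indices(conflicts, order, L):
    """Greedily tag indices with numbers 1..L so that conflicting indices get different tags.

    conflicts maps each index to the set of indices it conflicts with
    (the set I(k, alpha) of the text, for one fixed level k).
    """
    tags = {}
    for idx in order:
        # Tags already taken by previously tagged conflicting indices.
        used = {tags[other] for other in conflicts[idx] if other in tags}
        # At most L - 1 tags are taken when L > max #I(k, alpha), so a free tag exists.
        tags[idx] = next(i for i in range(1, L + 1) if i not in used)
    return tags

# Toy level with four indices; each index conflicts with itself and its neighbours.
conflicts = {
    "a": {"a", "b"},
    "b": {"a", "b", "c"},
    "c": {"b", "c", "d"},
    "d": {"c", "d"},
}
L = 1 + max(len(s) for s in conflicts.values())
tags = tag_indices(conflicts, order=["a", "b", "c", "d"], L=L)
```

Since two distinct conflicting indices never share a tag, a single random draw of $i$ singles out at most one index from each conflicting pair for the free random choice of its new center.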

The result is some new set of points $x^k_\alpha$, which readily qualify as new dyadic
points (that is, $d(x^k_\alpha, x^k_\beta) \ge \delta^k/8$ for all $\alpha \ne \beta$
and $\min_\alpha d(x, x^k_\alpha) < 4\delta^k$ (with some better constants even)). This
is an easy consequence of the construction, and we omit the details. Also evident is
the fact that $\mathbb{P}(z^{k+1}_\beta = x^k_\alpha) \ge \pi_0 > 0$ for some absolute
constant $\pi_0$ if $(k+1,\beta) \le (k,\alpha)$ (this needed an extra argument with
the randomization used in [HM09]). Now the same proof as in [HM09, Lemma 10.1]
also gives us the same result with this modified randomization. That is, we have:

4.1. Lemma. For any fixed $x \in X$ and $k \in \mathbb{Z}$, it holds
\[
\mathbb{P}(x \in \delta_{Q^k_\alpha} \text{ for some } \alpha) \lesssim \epsilon^\eta
\]
for some $\eta > 0$. Here
$\delta_{Q^k_\alpha} = \{x : d(x, Q^k_\alpha) \le \epsilon\ell(Q^k_\alpha) \text{ and } d(x, X \setminus Q^k_\alpha) \le \epsilon\ell(Q^k_\alpha)\}$.

We shall now modify the notion of goodness. Here we are given two dyadic
systems of cubes $\mathcal{D} = \{Q^k_\alpha\}$ and $\mathcal{D}' = \{R^k_\alpha\}$ as
always. This amounts to randomly producing two sets of new dyadic points
$(x^k_\alpha)$ and $(y^k_\alpha)$ using the above procedure and then choosing
(following certain established rules but somewhat arbitrarily) some relations
$\le_{\mathcal{D}}$ and $\le_{\mathcal{D}'}$ related to the systems $(x^k_\alpha)$ and
$(y^k_\alpha)$ respectively. Indeed, this information generates the families of cubes
$\mathcal{D} = \{Q^k_\alpha\}$ and $\mathcal{D}' = \{R^k_\alpha\}$. Set
\[
\gamma := \frac{\alpha}{2(\alpha + d)},
\]
where we recall that $d := \log_2 C_\lambda$ in our setting.

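4.2. Definition. We say that $Q^k_\alpha \in \mathcal{D}$ is geometrically
$\mathcal{D}'$-bad, if there exist $(k-s,\beta) \ne (k-s,\gamma)$ for some $s \ge r$ so
that for some $(k-1,\eta) \le_{\mathcal{D}'} (k-s,\beta)$ and
$(k-1,\zeta) \le_{\mathcal{D}'} (k-s,\gamma)$ we have
$d(x^k_\alpha, y^{k-1}_\eta) \le \delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$ and
$d(x^k_\alpha, y^{k-1}_\zeta) \le \delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$. Otherwise
$Q^k_\alpha$ is geometrically $\mathcal{D}'$-good. (When $\gamma$ appears as an index of
a cube or of a point, it is merely a label and is unrelated to the exponent $\gamma$
fixed above.)

Here the new feature is that with this definition the badness of a cube $Q^k_\alpha$
depends only on the centers of generations $\ell < k$ of the other system. Let us then
explain why this is still pretty close to the definition given in [HM09]. Note
that $\delta^k = \delta^{(1-\gamma)s} \cdot \delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$
and $\delta^{(1-\gamma)s} \le \delta^{(1-\gamma)r} \le 10^{-5}$ (as $r$ is fixed to
be big enough). Suppose $Q^k_\alpha$ is good and $s \ge r$. We have that
$x^k_\alpha \in R^{k-1}_\eta \subset R^{k-s}_\beta$ for some unique
$(k-1,\eta) \le_{\mathcal{D}'} (k-s,\beta)$. Now
$d(x^k_\alpha, y^{k-1}_\eta) \le 10\delta^{k-1} = 10^4\delta^k \le \delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$.
Suppose (aiming for a contradiction) that we would have
$d(x^k_\alpha, X \setminus R^{k-s}_\beta) < (3/4)\delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$.
Then we would have for some $z \in X \setminus R^{k-s}_\beta$ that
$d(x^k_\alpha, z) < (3/4)\delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$. But then
$z \in R^{k-1}_\zeta \subset R^{k-s}_\gamma$ for some
$(k-1,\zeta) \le_{\mathcal{D}'} (k-s,\gamma) \ne (k-s,\beta)$ and
\[
d(x^k_\alpha, y^{k-1}_\zeta) \le d(x^k_\alpha, z) + d(z, y^{k-1}_\zeta) \le [3/4 + 10^{-1}]\delta^{\gamma k}\delta^{(1-\gamma)(k-s)} \le \delta^{\gamma k}\delta^{(1-\gamma)(k-s)},
\]
contradicting the goodness of $Q^k_\alpha$. So we must have
\[
\begin{aligned}
d(Q^k_\alpha, X \setminus R^{k-s}_\beta) &\ge d(x^k_\alpha, X \setminus R^{k-s}_\beta) - 10\delta^k \\
&\ge [3/4 - 10^{-4}]\delta^{\gamma k}\delta^{(1-\gamma)(k-s)} \ge 2^{-1}\delta^{\gamma k}\delta^{(1-\gamma)(k-s)}.
\end{aligned}
\]
Thus also $d(Q^k_\alpha, R^{k-s}_\gamma) \ge 2^{-1}\delta^{\gamma k}\delta^{(1-\gamma)(k-s)}$
for every $\gamma \ne \beta$. We record these easy observations as a lemma.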






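4.3. Lemma. If $Q \in \mathcal{D}$ is geometrically $\mathcal{D}'$-good, then for every
$R \in \mathcal{D}'$ for which $\ell(Q) \le \delta^r\ell(R)$ we have either
$d(Q,R) \gtrsim \ell(Q)^\gamma\ell(R)^{1-\gamma}$ or
$d(Q, X \setminus R) \gtrsim \ell(Q)^\gamma\ell(R)^{1-\gamma}$.

If $Q^k_\alpha$ is bad, then the definition demands that for some $s \ge r$ we have that
$x^k_\alpha \in R^{k-s}_\beta \in \mathcal{D}'$ so that
$d(x^k_\alpha, X \setminus R^{k-s}_\beta) \le \delta^{\gamma k}\delta^{(1-\gamma)(k-s)} = \delta^{\gamma s}\delta^{k-s} = \delta^{\gamma s}\ell(R^{k-s}_\beta)$.
Lemma 4.1 with $\epsilon = \delta^{\gamma s}$ then yields that
\[
\mathbb{P}(Q^k_\alpha \text{ is geometrically } \mathcal{D}'\text{-bad}) \lesssim \sum_{s=r}^\infty (\delta^{\gamma\eta})^s \lesssim \delta^{\gamma\eta r}.
\]
We have proved the following.

4.4. Lemma. For a fixed $Q \in \mathcal{D}$ we have under the random choice of the
$\mathcal{D}'$-grid that
\[
\mathbb{P}(Q \text{ is geometrically } \mathcal{D}'\text{-bad}) \lesssim \delta^{\gamma\eta r}.
\]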

We still need to achieve the effect that $\mathbb{P}(Q \text{ is good})$ would not
depend on the particular choice of the cube $Q$ (in $\mathbb{R}^n$ this followed from
symmetry, see [Hyt09b]). There seems to be no obvious reason why this should be the
case already, so we will force this by understanding goodness in a stronger sense: a
cube is good if it is geometrically good and pseudogood - a notion to be defined.

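Define $\pi_{x^k_\alpha} = \mathbb{P}(Q^k_\alpha \text{ is geometrically } \mathcal{D}'\text{-good})$.
Note that under the random choice of the other grid $\mathcal{D}'$, this really depends
only on the center $x^k_\alpha$ of $Q^k_\alpha$. Set
$\pi_{\mathrm{good}} = 1 - C\delta^{\gamma\eta r}$ so that always
$\pi_{x^k_\alpha} \ge \pi_{\mathrm{good}}$. Set $Z(t^k_\alpha, x^k_\alpha) = 1$, if
$0 \le t^k_\alpha \le \pi_{\mathrm{good}}/\pi_{x^k_\alpha}$, and
$Z(t^k_\alpha, x^k_\alpha) = 0$ if
$1 \ge t^k_\alpha > \pi_{\mathrm{good}}/\pi_{x^k_\alpha}$. Now
$\mathbb{P}(Z(t^k_\alpha, x^k_\alpha) = 1 \mid x^k_\alpha) = \pi_{\mathrm{good}}/\pi_{x^k_\alpha}$
using the Lebesgue measure on the interval $[0,1]$. We say that $Q^k_\alpha$ is
pseudogood if $Z(t^k_\alpha, x^k_\alpha) = 1$, and $\mathcal{D}'$-good if it is
geometrically $\mathcal{D}'$-good and pseudogood. If one considers the grid
$\mathcal{D}$ to be fixed, then under the random choice of the pseudogoodness parameters
and the grid $\mathcal{D}'$, we have by independence that
$\mathbb{P}(Q^k_\alpha \text{ is } \mathcal{D}'\text{-good}) = \pi_{\mathrm{good}}$ for
every $Q^k_\alpha \in \mathcal{D}$. We use analogous random variables
$W(u^k_\alpha, y^k_\alpha)$ to determine the pseudogoodness status of a cube in the grid
$\mathcal{D}'$, and then the $\mathcal{D}$-goodness is also similarly defined.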
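The effect of the artificial badness can be seen in a small simulation. This is only a sketch under simplifying assumptions: geometric goodness is modelled as an independent event of probability $\pi_x$ (in the paper it is determined by the random grid $\mathcal{D}'$), and the numerical values of $\pi_x$ and $\pi_{\mathrm{good}}$ are made up for illustration.

```python
import random

def is_good(pi_x, pi_good, rng):
    """One draw: geometric goodness with probability pi_x, then independent pseudogoodness."""
    geometrically_good = rng.random() < pi_x
    t = rng.random()                      # the uniform pseudogoodness parameter t
    pseudogood = t <= pi_good / pi_x      # the event Z(t, x) = 1
    return geometrically_good and pseudogood

def estimate(pi_x, pi_good, trials=200_000, seed=0):
    """Monte Carlo estimate of P(cube is good) for a cube with geometric goodness pi_x."""
    rng = random.Random(seed)
    return sum(is_good(pi_x, pi_good, rng) for _ in range(trials)) / trials

pi_good = 0.6
# Cubes with different geometric goodness probabilities pi_x >= pi_good.
estimates = {pi_x: estimate(pi_x, pi_good) for pi_x in (0.6, 0.75, 0.9)}
```

Each estimate is close to $\pi_{\mathrm{good}}$ regardless of $\pi_x$, in line with the identity $\pi_x \cdot (\pi_{\mathrm{good}}/\pi_x) = \pi_{\mathrm{good}}$: the extra randomness discards exactly the surplus of goodness that a particular cube might have.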

Basically all these modifications were done to prove the following analogue of
[Hyt09b, Lemma 5.2] with our randomized systems of metric dyadic cubes. This
enables us to later establish that a certain paraproduct is bounded following the
strategy used in [Hyt09b].

First a few comments. In the following chapter we shall introduce two fixed
functions $f$ and $g$, and their martingale difference decompositions using Haar
functions. The aim is then to control a certain average (5.1). The details of this
are not important for the next lemma, except for the fact that looking at that
particular sum one sees that it is enough to sum over some fixed finite index set
$(k,\alpha)$ (because the functions have bounded support, the space is geometrically
doubling, and cubes of only finitely many generations are needed). Thus, we
assume that such is the case in the next lemma also. This enables us to move
$\mathbb{E}$ in and out of the summation freely (see the proof). Also, $\varphi(Q,R)$ is
an $L^1$-function of cubes $Q$ and $R$ and their children - basically in the only
application of this lemma we take
$\varphi(Q,R) = \langle g, \psi_R \rangle \langle b_2, T(b_1\varphi_Q) \rangle \langle \psi_R \rangle_Q \langle \varphi_Q, f \rangle$
(see the chapters 5 and 8).

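4.5. Lemma. We have that
\[
(1 - C\delta^{\gamma\eta r})\, \mathbb{E} \sum_{R \in \mathcal{D}'} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q) \le \ell(R)}} \varphi(Q,R)
= \mathbb{E} \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q) \le \ell(R)}} \varphi(Q,R),
\]
where the grid $\mathcal{D}'$ is fixed (so a set of points $(y^k_\alpha)$ is fixed) and we
average over every other random quantity $((x^k_\alpha), (t^k_\alpha), (u^k_\alpha))$.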

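Proof. We start by recalling the dependencies (remember that the points
$(y^m_\gamma)$ are fixed). The goodness of a cube $R^m_\gamma \in \mathcal{D}'$ depends
on the points $x^k_\alpha$ for which $k < m$ and on $u^m_\gamma$. The goodness of a cube
$Q^k_\alpha \in \mathcal{D}$ depends on $x^k_\alpha$ and $t^k_\alpha$. As sets,
$Q^k_\alpha$ and its children depend on the centers $x^\ell_\beta$ for which
$\ell \ge k$ (and this is because of the restriction
$\delta^k = \ell(Q^k_\alpha) > \delta^{k_0}$ which says $k < k_0$).

Note that
$\pi_{\mathrm{good}} = \mathbb{P}(R \in \mathcal{D}'_{\mathrm{good}}) = \mathbb{E}(\chi_{\mathrm{good}}(R))$
for every $R \in \mathcal{D}'$. Thus, we have
\[
\begin{aligned}
\pi_{\mathrm{good}}\, \mathbb{E} \sum_{R \in \mathcal{D}'} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q) \le \ell(R)}} \varphi(Q,R)
&= \pi_{\mathrm{good}}\, \mathbb{E} \sum_{(m,\gamma)} \sum_{\substack{(k,\alpha) \\ m \le k < k_0}} \chi_{\mathrm{good}}(Q^k_\alpha)\varphi(Q^k_\alpha, R^m_\gamma) \\
&= \sum_{(m,\gamma)} \sum_{\substack{(k,\alpha) \\ m \le k < k_0}} \mathbb{E}(\chi_{\mathrm{good}}(R^m_\gamma))\, \mathbb{E}(\chi_{\mathrm{good}}(Q^k_\alpha)\varphi(Q^k_\alpha, R^m_\gamma)) \\
&= \sum_{(m,\gamma)} \sum_{\substack{(k,\alpha) \\ m \le k < k_0}} \mathbb{E}(\chi_{\mathrm{good}}(R^m_\gamma)\chi_{\mathrm{good}}(Q^k_\alpha)\varphi(Q^k_\alpha, R^m_\gamma)) \\
&= \mathbb{E} \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q) \le \ell(R)}} \varphi(Q,R).
\end{aligned}
\]
Let us still spell out the details of the above computation (since it is actually
surprisingly subtle and depends on all of the modifications made above). We
first removed everything that is random from the summations. Then we moved
the expectation inside the summation (the sum is finite by assumption), and after
that we also moved the constant $\pi_{\mathrm{good}} = 1 - C\delta^{\gamma\eta r}$ inside
the summation noting then that it equals $\mathbb{E}(\chi_{\mathrm{good}}(R^m_\gamma))$
with any $(m,\gamma)$. Next we used the product rule of expectations of independent
quantities: the random variable $\chi_{\mathrm{good}}(R^m_\gamma)$ depends on
$x^\ell_\beta$ for $\ell < m$ and on $u^m_\gamma$, and the random variable
$\chi_{\mathrm{good}}(Q^k_\alpha)\varphi(Q^k_\alpha, R^m_\gamma)$ depends on
$x^\ell_\beta$ for $\ell \ge k \ge m$ and on $t^k_\alpha$. Recall also that the points of
different generations are independently chosen. Finally we moved the expectation out and
rewrote the summation so that it again contains the random quantities. □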
5. Martingale difference decomposition, Haar functions and the tangent martingale trick

Let us be given some system of cubes $\{Q^k_\alpha\}$ and some accretive function $b$.
We write $b(Q) = \int_Q b\,d\mu$ and set
\[
\begin{aligned}
E^b_k f &= \sum_\alpha \chi_{Q^k_\alpha} \frac{b}{b(Q^k_\alpha)} \int_{Q^k_\alpha} f\,d\mu, \\
E^b_{Q^k_\alpha} f &= \chi_{Q^k_\alpha} E^b_k f, \\
\Delta^b_k f &= E^b_{k+1} f - E^b_k f, \\
\Delta^b_{Q^k_\alpha} f &= \chi_{Q^k_\alpha} \Delta^b_k f.
\end{aligned}
\]
Consider some cube $Q$. It has subcubes of the next generation $Q_i$,
$i = 1, \ldots, s(Q)$, where $s(Q) \lesssim 1$. We set
$\hat{Q}_k = \bigcup_{i=k}^{s(Q)} Q_i$, and note that we can always arrange
the indexation of the subcubes to be such that $|b(\hat{Q}_k)| \gtrsim \mu(Q)$ for every
$k = 1, \ldots, s(Q)$. Indeed, we can index so that (here $a$ is the accretivity constant
of $b$)
\[
|b(\hat{Q}_k)| \ge \Big( 1 - \frac{k-1}{s(Q)} \Big) a\mu(Q) \gtrsim \mu(Q),
\]
and this can be proven as [Hyt09b, Lemma 4.3]. Note also that trivially
$|b(\hat{Q}_k)| \lesssim \mu(Q)$ (so $|b(\hat{Q}_k)| \sim \mu(Q)$) and
$|b(Q_i)| \sim \mu(Q_i)$.
Now define 

^q, «/ = e qJ + Eb Q u+ J ~ Eb dJ 

also noting that 

s(Q)-l 

a q/ = E a qJ- 

u=l 

A computation shows that 

A b Q J = bcp b Q J^ u J), 
where we have the adapted Haar functions 

6 ( b{Q u )b{Qu+i) \ l ' 2 ( _XQy_ _ Xq* 
^ V h(6.\ ) \b(Q„) h(6.. 



KQ U ) J y KQu) b(Q u+1 ] 

as in [ Hyt09b[ . Here we have to interpret (p b Q u = if n(Q u ) = 0. We also have the 
non-cancellative adapted Haar function 

<&flf = KQ)- 1/2 xq 

using which we write E b Q f = b(p b Qfi (<p b Qt0 , f). 
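The "computation" relating the martingale differences to the adapted Haar functions can be checked with exact arithmetic on a finite model. The sketch below is an illustration under stated assumptions: the space consists of six weighted points, the cube $Q$ has three children, and $b$ is real and positive (a special case of accretivity); all names and numerical values are hypothetical. The square root in the Haar function is avoided by using that only the product $\varphi(x)\varphi(y)$ enters.

```python
from fractions import Fraction as F

# Toy measure space: points 0..5 with weights mu, split into children Q_1, Q_2, Q_3 of Q.
mu = [F(1), F(2), F(1), F(3), F(1), F(2)]
b = [F(1), F(3, 2), F(2), F(1), F(5, 4), F(1)]   # real and bounded below: accretive
f = [F(2), F(-1), F(0), F(4), F(1), F(-3)]
children = [[0, 1], [2, 3], [4, 5]]

def integral(g, S):
    """Integral of g over the set S with respect to mu."""
    return sum(g[i] * mu[i] for i in S)

def E(S, i):
    """Value at the point i of E^b_S f = chi_S * b * (int_S f) / (int_S b)."""
    return b[i] * integral(f, S) / integral(b, S) if i in S else F(0)

def tail(u):
    """The set hat Q_u: union of the children Q_u, ..., Q_s (u is 1-based)."""
    return [i for child in children[u - 1:] for i in child]

def delta_direct(u, i):
    """Delta_{Q,u} f at the point i, straight from the definition."""
    return E(children[u - 1], i) + E(tail(u + 1), i) - E(tail(u), i)

def delta_haar(u, i):
    """b * phi * <phi, f> at the point i, with phi(x) phi(y) = c * psi(x) * psi(y)."""
    Qu, T1, T0 = children[u - 1], tail(u + 1), tail(u)
    bQ, bT1, bT0 = integral(b, Qu), integral(b, T1), integral(b, T0)
    c = bQ * bT1 / bT0

    def psi(j):
        return (F(1) / bQ if j in Qu else F(0)) - (F(1) / bT1 if j in T1 else F(0))

    pairing = sum(psi(j) * f[j] * mu[j] for j in range(len(mu)))
    return b[i] * c * psi(i) * pairing
```

In this model both the Haar representation and the telescoping identity $\Delta^b_Q f = \sum_{u=1}^{s(Q)-1} \Delta^b_{Q,u} f$ (with $\Delta^b_Q f = \sum_i E^b_{Q_i} f - E^b_Q f$ on $Q$) hold pointwise with exact equality.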

We record the key properties (the last two being only important special cases)
and
\[
\|\varphi^b_{Q,u}\|_{L^1(X)} \|\varphi^b_{Q,u}\|_{L^\infty(X)} \sim 1.
\]
Given a dyadic system $\mathcal{D} = \{Q\}$ we can write with any $m$ that
\[
\begin{aligned}
f &= \sum_{\substack{Q \in \mathcal{D} \\ \ell(Q) \le \delta^m}} \Delta^b_Q f + \sum_{\substack{Q \in \mathcal{D} \\ \ell(Q) = \delta^m}} E^b_Q f \\
&= \sum_{\substack{Q \in \mathcal{D} \\ \ell(Q) \le \delta^m}} \sum_u b\varphi^b_{Q,u} \langle \varphi^b_{Q,u}, f \rangle,
\end{aligned}
\]
where the $u$ summation runs through $1, \ldots, s(Q) - 1$ if $\ell(Q) < \delta^m$, and
through $0, 1, \ldots, s(Q) - 1$ if $\ell(Q) = \delta^m$. The unconditional convergence of
this in $L^p(X,Y)$ is not at all clear, but it nevertheless follows as in
[Hyt09b, Proposition 4.1] (note that in that proof certain abstract paraproducts are
used, but their theory is formulated in chapter 3 of [Hyt09b] in an abstract filtered
space which directly applies also in our situation).

Basically the strategy we shall use is the usual one: write the same decomposition
for a function $g \in L^{p'}(X,Y^*)$ just using some other grid
$\mathcal{D}' = \{R\}$ and the other test function $b_2$, and then decompose the pairing
$\langle g, Tf \rangle$ accordingly. However, Lemma 4.5 has the restriction involving
$k_0$ (which we have not yet fixed) and so we somehow need to get into a situation
where we do not need to consider arbitrarily small cubes.

We start by choosing two boundedly supported functions $f \in L^p(X,Y)$ and
$g \in L^{p'}(X,Y^*)$ so that $f/b_1$ and $g/b_2$ are Lipschitz,
$\|f\|_{L^p(X,Y)} = \|g\|_{L^{p'}(X,Y^*)} = 1$ and
$\|T\| \le 2|\langle g, Tf \rangle|$. Here, of course,
$\|T\| = \|T\|_{L^p(X,Y) \to L^p(X,Y)}$. For the fact that Lipschitz functions are
dense, see e.g. the proof of [Hyt09a, Proposition 3.4]. We now also fix $m$ so that the
supports of the functions $f$ and $g$ are contained in some balls $B(x_0, \delta^m)$ and
$B(x_1, \delta^m)$ respectively.

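Using any two dyadic systems $\mathcal{D}$ and $\mathcal{D}'$ we decompose
\[
\langle g, Tf \rangle = \langle g - E^{b_2}_{k_0}g, Tf \rangle + \langle E^{b_2}_{k_0}g, T(f - E^{b_1}_{k_0}f) \rangle + \langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle,
\]
and then estimate
\[
\begin{aligned}
|\langle g, Tf \rangle| \le{}& \|T\| \|g - E^{b_2}_{k_0}g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)} \\
&+ \|T\| \|E^{b_2}_{k_0}g\|_{L^{p'}(X,Y^*)} \|f - E^{b_1}_{k_0}f\|_{L^p(X,Y)} + |\langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle|.
\end{aligned}
\]
Note that $\|E^{b_2}_{k_0}g\|_{L^{p'}(X,Y^*)} \lesssim \|g\|_{L^{p'}(X,Y^*)} = 1$ so that
we get
\[
|\langle g, Tf \rangle| \le \big( C(b_2)\|f - E^{b_1}_{k_0}f\|_{L^p(X,Y)} + \|g - E^{b_2}_{k_0}g\|_{L^{p'}(X,Y^*)} \big)\|T\| + |\langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle|.
\]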
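Next we employ the facts that $f/b_1$ and $g/b_2$ are Lipschitz (with a constant $L$
say). Let $h = f/b_1$. Let $x \in X$ and then let $Q$ denote the unique
$\mathcal{D}$-cube of generation $k_0$ containing $x$. We have that
\[
\begin{aligned}
\|E^{b_1}_{k_0}f(x) - f(x)\|_Y &= \|E^{b_1}_{k_0}(b_1h)(x) - (b_1h)(x)\|_Y \\
&\lesssim \frac{1}{\mu(Q)} \int_Q |b_1(z)| \|h(z) - h(x)\|_Y\,d\mu(z) \\
&\lesssim Ld(Q) \lesssim L\delta^{k_0}.
\end{aligned}
\]
Noting that
$\bigcup \{Q : Q \in \mathcal{D}_{k_0}, Q \cap B(x_0, \delta^m) \ne \emptyset\} \subset B(x_0, 2\delta^m)$
we have that
\[
\|f - E^{b_1}_{k_0}f\|_{L^p(X,Y)} \lesssim L\lambda(x_0, \delta^m)^{1/p}\delta^{k_0}.
\]
A similar estimate holds for $\|g - E^{b_2}_{k_0}g\|_{L^{p'}(X,Y^*)}$. We fix $k_0$ to be
so large that we have
\[
\|T\|/2 \le |\langle g, Tf \rangle| \le \|T\|/4 + |\langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle|,
\]
that is, $\|T\| \le 4|\langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle|$ with any grids
$\mathcal{D}$ and $\mathcal{D}'$ (but only with these particular fixed functions $f$ and
$g$, of course).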

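Now we write $\langle E^{b_2}_{k_0}g, T(E^{b_1}_{k_0}f) \rangle$ as the following sum
\[
\begin{aligned}
&\Big\langle \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \delta^{k_0} < \ell(R) \le \delta^m}} \Delta^{b_2}_R g + \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \ell(R) = \delta^m}} E^{b_2}_R g,\ T(E^{b_1}_{k_0}f) \Big\rangle \\
&+ \Big\langle \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}} \\ \delta^{k_0} < \ell(R) \le \delta^m}} \Delta^{b_2}_R g + \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}} \\ \ell(R) = \delta^m}} E^{b_2}_R g,\ T\Big( \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \delta^{k_0} < \ell(Q) \le \delta^m}} \Delta^{b_1}_Q f + \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \ell(Q) = \delta^m}} E^{b_1}_Q f \Big) \Big\rangle \\
&+ \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\, R \in \mathcal{D}'_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q), \ell(R) \le \delta^m}} \sum_{u,v} \langle g, \varphi^{b_2}_{R,v} \rangle \langle b_2\varphi^{b_2}_{R,v}, T(b_1\varphi^{b_1}_{Q,u}) \rangle \langle \varphi^{b_1}_{Q,u}, f \rangle,
\end{aligned}
\]
where the $u$ summation runs through $1, \ldots, s(Q) - 1$ if $\ell(Q) < \delta^m$, and
through $0, 1, \ldots, s(Q) - 1$ if $\ell(Q) = \delta^m$, and similarly for the $v$
summation. We thus have that $\|T\|/4$ is bounded by the sum of the following terms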



IITI 



E a&+ e 4 2 < 



Rev 



bad 



Rev 



\E b k J\\ L p{x,Y), 



bad 



5 k o<E(R)<5^ 



l(R)=5 m 



wn E E e r9 



^e^good 
5 fc o<£(R)<<5" 



^e^good 



LP > (x,y*) 



E A S/+ E E Qf 



QGD bad 

8 k o<e(Q)<5» 



Qe© bad 

£(Q)=<5™ 



LP(X,Y) 
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and 



8 k o<£(Q),£(R)<S" 
bi 



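Note that clearly $\|E^{b_1}_{k_0}f\|_{L^p(X,Y)} \lesssim \|f\|_{L^p(X,Y)} = 1$ and
\[
\Big\| \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}} \\ \ell(R) = \delta^m}} E^{b_2}_R g \Big\|_{L^{p'}(X,Y^*)} \lesssim 1.
\]
Also, using unconditionality and the contraction principle, we have that
\[
\Big\| \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}} \\ \delta^{k_0} < \ell(R) \le \delta^m}} \Delta^{b_2}_R g \Big\|_{L^{p'}(X,Y^*)} \lesssim \|g\|_{L^{p'}(X,Y^*)} = 1.
\]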



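Thus, the terms involving bad cubes are dominated by
\[
\|T\| \Big( \Big\| \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \delta^{k_0} < \ell(R) \le \delta^m}} \Delta^{b_2}_R g + \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \ell(R) = \delta^m}} E^{b_2}_R g \Big\|_{L^{p'}(X,Y^*)} + \Big\| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \delta^{k_0} < \ell(Q) \le \delta^m}} \Delta^{b_1}_Q f + \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \ell(Q) = \delta^m}} E^{b_1}_Q f \Big\|_{L^p(X,Y)} \Big).
\]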



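Taking expectations over all the random quantities in the randomization of cubes,
it is easy to see that
\[
E \Big\| \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \ell(R) = \delta^m}} E^{b_2}_R g \Big\|_{L^{p'}(X,Y^*)} + E \Big\| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \ell(Q) = \delta^m}} E^{b_1}_Q f \Big\|_{L^p(X,Y)} \le \eta(r),
\]
where $\eta(r) \to 0$ when $r \to \infty$. Working similarly as later in chapter 9 (when
estimating a certain term $E_1$) we have that
\[
E \Big\| \sum_{\substack{R \in \mathcal{D}'_{\mathrm{bad}} \\ \delta^{k_0} < \ell(R) \le \delta^m}} \Delta^{b_2}_R g \Big\|_{L^{p'}(X,Y^*)} + E \Big\| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{bad}} \\ \delta^{k_0} < \ell(Q) \le \delta^m}} \Delta^{b_1}_Q f \Big\|_{L^p(X,Y)} \le \eta(r)
\]
as well. The proof requires a certain improvement of the contraction principle
which will also be recalled in chapter 9. One can consult [Hyt09b, chapter 12]
too.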

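Choosing $r$ large enough we thus have that
\[
\|T\|/8 \le E \Big| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ R \in \mathcal{D}'_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q), \ell(R) \le \delta^m}} \sum_{u,v} \langle g, \psi^{b_2}_{R,v}\rangle \langle b_2\psi^{b_2}_{R,v}, T(b_1\varphi^{b_1}_{Q,u})\rangle \langle \varphi^{b_1}_{Q,u}, f\rangle \Big|. \tag{5.1}
\]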
We almost always suppress the finite summation over $u, v$ and after that is done,
simply write $\varphi_Q = \varphi^{b_1}_{Q,u}$, $\psi_R = \psi^{b_2}_{R,v}$ and $T_{RQ} = \langle b_2\psi_R, T(b_1\varphi_Q)\rangle$. The summation
condition $\delta^{k_0} < \ell(Q), \ell(R) \le \delta^m$ is always in force, and thus most of the time
not explicitly written. The estimation of this series involving good cubes only is
now split into multiple subseries to be considered in the subsequent chapters. We
primarily deal with the part $\ell(Q) \le \ell(R)$, the other being symmetric. Although we
have $\|f\|_{L^p(X,Y)} = \|g\|_{L^{p'}(X,Y^*)} = 1$, in some of the estimates below we explicitly
write $\|f\|_{L^p(X,Y)}$ and $\|g\|_{L^{p'}(X,Y^*)}$ in place of 1 for clarity.

We still comment on some of the techniques used in the following chapters.
Related to this vector-valued $L^p$-theory we combine basic randomization tricks
with the more sophisticated tool called the tangent martingale trick in [Hyt09b].
Let us now formulate this since it is of fundamental importance to us.

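5.2. Proposition. Let $\mathcal{A} = \bigcup_k \mathcal{A}_k$, where $\mathcal{A}_k$ is a countable partition of $X$ into Borel
sets of finite $\mu$-measure, and $\sigma(\mathcal{A}_k) \subset \sigma(\mathcal{A}_{k+1})$. For each $A \in \mathcal{A}$ we are given a function
$f_A \colon X \to Y$ supported on $A$, and so that $f_A$ is $\sigma(\mathcal{A}_{k+1})$-measurable whenever $A \in \mathcal{A}_k$.
For each $A \in \mathcal{A}$ we are also given a jointly measurable function $k_A \colon A \times A \to \mathbb{C}$, which
is pointwise bounded by 1. We have
\[
\begin{aligned}
&\int_X \int_\Omega \Big| \sum_k \varepsilon_k \sum_{A \in \mathcal{A}_k} \frac{\chi_A(x)}{\mu(A)} \int_A k_A(x,z) f_A(z)\,d\mu(z) \Big|_Y^p\,dP(\varepsilon)\,d\mu(x) \\
&\qquad \lesssim \int_X \int_\Omega \Big| \sum_k \varepsilon_k \sum_{A \in \mathcal{A}_k} f_A(x) \Big|_Y^p\,dP(\varepsilon)\,d\mu(x).
\end{aligned}
\]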

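This is the only version of the trick we explicitly need in this paper. For this
result and some more general theory related to this see [Hyt09b, chapter 6]. Lastly,
we record the following randomization trick which is used multiple times in the
sequel. For the proof see [Hyt09b, p. 10].

5.3. Lemma. Suppose that for each $R \in \mathcal{D}'$ we are given a subcollection $\mathcal{D}(R) \subset \mathcal{D}$.
There holds
\[
\Big| \sum_{R \in \mathcal{D}'} \langle g, \psi_R\rangle \sum_{Q \in \mathcal{D}(R)} T_{RQ} \langle \varphi_Q, f\rangle \Big| \lesssim \|g\|_{L^{p'}(X,Y^*)} \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_k} \psi_R \sum_{Q \in \mathcal{D}(R)} T_{RQ} \langle \varphi_Q, f\rangle \Big\|_{L^p(\Omega \times X, Y)},
\]
where we have the measure $P \times \mu$ on $\Omega \times X$ (here $(\Omega, P)$ is just some probability space).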

6. Separated cubes 

We consider the part of the series where $R \in \mathcal{D}'_{\mathrm{good}}$, $Q \in \mathcal{D}_{\mathrm{good}}$, $\ell(Q) \le \ell(R)$ and
$d(Q, R) \ge CC_0\ell(Q)$. Also the adapted Haar functions $\varphi_Q$ related to the smaller
cubes $Q$ are assumed to be cancellative.

We begin with some estimates for the matrix elements $T_{RQ} = \langle b_2\psi_R, T(b_1\varphi_Q)\rangle$
- these follow, with some modifications, [HM09, Lemma 6.1 and Lemma 6.2].
6.1. Lemma. Let $Q \in \mathcal{D}$ and $R \in \mathcal{D}'$ be such that $\ell(Q) \le \ell(R)$ and $d(Q, R) \ge CC_0\ell(Q)$.
Assume also that $\varphi_Q$ is cancellative. We have the estimate
\[
|T_{RQ}| \lesssim \frac{\ell(Q)^\alpha}{d(Q,R)^\alpha \sup_{z \in Q} \lambda(z, d(Q,R))} \|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}.
\]
Proof. Recalling that $\int b_1\varphi_Q\,d\mu = 0$, we have for an arbitrary $z \in Q$ that
\[
T_{RQ} = \iint [K(x,y) - K(x,z)]\,b_1(y)\varphi_Q(y)\,b_2(x)\psi_R(x)\,d\mu(y)\,d\mu(x).
\]
The claim follows from the kernel estimates (which we may utilize since $d(x, z) \ge
d(Q, R) \ge CC_0\ell(Q) \ge Cd(y, z)$). □

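We set $D(Q, R) = \ell(Q) + \ell(R) + d(Q, R)$.

6.2. Lemma. Let $Q \in \mathcal{D}_{\mathrm{good}}$ and $R \in \mathcal{D}'$ be such that $\ell(Q) \le \ell(R)$ and $d(Q, R) \ge CC_0\ell(Q)$.
Assume also that $\varphi_Q$ is cancellative. We have the estimate
\[
|T_{RQ}| \lesssim \frac{\ell(Q)^{\alpha/2}\ell(R)^{\alpha/2}}{D(Q,R)^\alpha \sup_{z \in Q} \lambda(z, D(Q,R))} \|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}.
\]
Proof. If $\ell(Q) \ge \delta^r\ell(R)$, then $d(Q, R) \gtrsim D(Q, R)$, and the claim follows from the
previous lemma. In the case $d(Q,R) \ge \ell(R)$, we also have $d(Q,R) \gtrsim D(Q,R)$,
and the claim again follows from the previous lemma.

We may thus assume that $\ell(Q) < \delta^r\ell(R)$ and $d(Q, R) < \ell(R)$. As $Q$ is good, we
have $d(Q, R) \ge \ell(Q)^\gamma\ell(R)^{1-\gamma}$. Consider an arbitrary $z \in Q$. Using the identity
\[
\ell(R) = \Big(\frac{\ell(R)}{\ell(Q)}\Big)^{\gamma} \ell(Q)^\gamma\ell(R)^{1-\gamma}
\]
and the doubling property of $\lambda$ one gets that
\[
\lambda(z, d(Q,R)) \gtrsim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\gamma d} \lambda(z, \ell(R)).
\]
The claim then follows from the previous lemma, the identity $\gamma d + \gamma\alpha = \alpha/2$, and
the fact that in our situation $\ell(R) \gtrsim D(Q,R)$. □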
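The way the identity $\gamma d + \gamma\alpha = \alpha/2$ enters can be written out as follows. This is a sketch for the remaining case $\ell(Q) < \delta^r\ell(R)$, $d(Q,R) < \ell(R)$ only; it assumes that $r \mapsto \lambda(z,r)$ is non-decreasing and doubling with exponent $d$, and it suppresses all implied constants.

```latex
\begin{align*}
\frac{\ell(Q)^{\alpha}}{d(Q,R)^{\alpha}\,\lambda(z,d(Q,R))}
 &\lesssim \frac{\ell(Q)^{\alpha}}{\ell(Q)^{\gamma\alpha}\,\ell(R)^{(1-\gamma)\alpha}}
   \Big(\frac{\ell(R)}{\ell(Q)}\Big)^{\gamma d}\frac{1}{\lambda(z,\ell(R))}
   && \text{(goodness and doubling)} \\
 &= \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha-\gamma\alpha-\gamma d}\frac{1}{\lambda(z,\ell(R))}
  = \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2}\frac{1}{\lambda(z,\ell(R))} \\
 &= \frac{\ell(Q)^{\alpha/2}\,\ell(R)^{\alpha/2}}{\ell(R)^{\alpha}\,\lambda(z,\ell(R))}
  \lesssim \frac{\ell(Q)^{\alpha/2}\,\ell(R)^{\alpha/2}}{D(Q,R)^{\alpha}\,\lambda(z,D(Q,R))}
   && \text{(since } D(Q,R) \le 3\ell(R)\text{)} .
\end{align*}
```

Since $z \in Q$ was arbitrary, one may pass to the supremum over $z$ in the denominator. Note that the stated identity forces $\gamma = \alpha/(2(\alpha+d))$.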

Let us then state and prove the main result of this section - this follows, save
the technical modifications, [Hyt09b, p. 25-26].

6.3. Proposition. There holds
\[
\Big| \sum_{\substack{\ell(Q) \le \ell(R) \\ d(Q,R) \ge CC_0\ell(Q)}} \langle g, \psi_R\rangle T_{RQ} \langle \varphi_Q, f\rangle \Big| \lesssim \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}
\]
with the additional interpretation that the adapted Haar functions $\varphi_Q$ related to the
smaller cubes $Q$ are cancellative, even on the coarsest level $\ell(Q) = \delta^m$.
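Proof. We first consider the case
\[
\begin{aligned}
&\ell(R) = \delta^k, && k \in \mathbb{Z}, \\
&\ell(Q) = \delta^{k+m}, && m = 0, 1, 2, \ldots, \\
&\delta^{k-j} \le D(Q,R) < \delta^{k-j-1}, && j = 0, 1, 2, \ldots.
\end{aligned}
\]
The last requirement says that $D(Q, R)/\ell(R) \sim \delta^{-j}$. The estimate from the previous
lemma gives
\[
\frac{|T_{RQ}|}{\|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}} \lesssim \frac{\delta^{\alpha m/2 + \alpha j}}{\sup_{z \in Q} \lambda(z, \delta^{k-j})}.
\]
We suppress from our notation the requirement that $d(Q, R) \ge CC_0\ell(Q)$. Lemma
5.3 gives
\[
\begin{aligned}
&\Big| \sum_{k \in \mathbb{Z}} \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ D(Q,R)/\ell(R) \sim \delta^{-j}}} \langle g, \psi_R\rangle T_{RQ} \langle \varphi_Q, f\rangle \Big| \\
&\qquad \lesssim \|g\|_{L^{p'}(X,Y^*)} \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ D(Q,R)/\ell(R) \sim \delta^{-j}}} \psi_R T_{RQ} \langle \varphi_Q, f\rangle \Big\|_{L^p(\Omega \times X, Y)}.
\end{aligned}
\]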

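For a cube $Q$ denote by $Q_\ell$ the unique cube of generation $\ell \le \mathrm{gen}(Q)$ for which
$Q \subset Q_\ell$. Let $\theta(j)$ denote the smallest integer for which $\theta(j) \ge (j\gamma + r)(1 - \gamma)^{-1}$.
Recalling that $R$ is good and $r$ is large enough, we must have for any $Q$ and $R$ in
the above summation that $R \subset Q_{k-j-\theta(j)}$. Thus, we may write
\[
\sum_{R \in \mathcal{D}'_{\mathrm{good},k}} = \sum_{S \in \mathcal{D}_{k-j-\theta(j)}} \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good},k} \\ R \subset S}}.
\]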

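Also, we have $\mu(S) \lesssim \inf_{w \in S} \lambda(w, \delta^{k-j-\theta(j)}) \lesssim \delta^{-d\theta(j)} \inf_{w \in S} \lambda(w, \delta^{k-j})$. Define $t_{RQ}$
via the identity
\[
T_{RQ} = \frac{\delta^{\alpha m/2 + \alpha j - d\theta(j)}}{\mu(S)} \|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}\,t_{RQ}
\]
and note that we have
\[
|t_{RQ}| \lesssim \frac{\inf_{w \in S} \lambda(w, \delta^{k-j})}{\sup_{z \in Q} \lambda(z, \delta^{k-j})} \le 1.
\]
Also relevant is the estimate
\[
\delta^{\alpha j - d\theta(j)} \lesssim \delta^{[\alpha - d\gamma(1-\gamma)^{-1}]j} = \delta^{(\alpha^2 + \alpha d)(\alpha + 2d)^{-1} j}.
\]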
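The exponent in the last estimate can be checked directly. The sketch below assumes the value $\gamma = \alpha/(2(\alpha + d))$, which is what the identity $\gamma d + \gamma\alpha = \alpha/2$ used in Lemma 6.2 forces; the constant $C(r)$ is shorthand introduced here for a quantity depending only on $r$, $\delta$, $d$ and $\gamma$.

```latex
\begin{align*}
\gamma = \frac{\alpha}{2(\alpha+d)}
 \quad&\Longrightarrow\quad
 1-\gamma = \frac{\alpha+2d}{2(\alpha+d)}, \qquad
 \frac{\gamma}{1-\gamma} = \frac{\alpha}{\alpha+2d}, \\
\alpha - \frac{d\gamma}{1-\gamma}
 &= \frac{\alpha(\alpha+2d) - \alpha d}{\alpha+2d}
  = \frac{\alpha^{2}+\alpha d}{\alpha+2d} > 0, \\
\delta^{-d\theta(j)}
 &\le \delta^{-d\,[(j\gamma + r)(1-\gamma)^{-1} + 1]}
  = C(r)\,\delta^{-d\gamma(1-\gamma)^{-1} j}
  && \text{(as } \theta(j) < (j\gamma+r)(1-\gamma)^{-1} + 1,\ \delta < 1\text{)} .
\end{align*}
```

The positivity of the exponent is what later makes the series over $j$ summable.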

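For every $S \in \mathcal{D}_{k-j-\theta(j)}$ we set
\[
K_S(x,y) = \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good},k} \\ R \subset S}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ D(Q,R)/\ell(R) \sim \delta^{-j}}} \psi_R(x) \|\psi_R\|_{L^1(\mu)}\,t_{RQ}\,\|\varphi_Q\|_{L^1(\mu)} \varphi_Q(y) b_1(y).
\]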






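As $\|\varphi_Q\|_{L^1(\mu)} \|\varphi_Q\|_{L^\infty(\mu)} \lesssim 1$, $\|\psi_R\|_{L^1(\mu)} \|\psi_R\|_{L^\infty(\mu)} \lesssim 1$, $\|b_1\|_{L^\infty(\mu)} \lesssim 1$, $|t_{RQ}| \lesssim 1$
and for every fixed $x$ and $y$ there is at most one non-zero term in the double
sum defining $K_S$, we have $|K_S(x,y)| \lesssim 1$. Also, $K_S$ is supported on $S \times S$ as
$\mathrm{spt}\,\psi_R \subset R \subset S$ and $\mathrm{spt}\,\varphi_Q \subset Q \subset S$.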

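Using the fact that $\int b_1\varphi_Q\,d\mu = 0$ one notes that $\langle \varphi_Q, f\rangle = \langle \varphi_Q, \Delta^{b_1}_{k+m}f\rangle$ for
$Q \in \mathcal{D}_{k+m}$. Using this and the definitions from above, we see that
\[
\begin{aligned}
&\Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ D(Q,R)/\ell(R) \sim \delta^{-j}}} \psi_R T_{RQ} \langle \varphi_Q, f\rangle \Big\|_{L^p(\Omega \times X, Y)} \\
&\qquad = \delta^{\alpha m/2 + \alpha j - d\theta(j)} \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{S \in \mathcal{D}_{k-j-\theta(j)}} \frac{\chi_S}{\mu(S)} \int_S K_S(\cdot, y) \frac{\Delta^{b_1}_{k+m}f(y)}{b_1(y)}\,d\mu(y) \Big\|_{L^p(\Omega \times X, Y)}.
\end{aligned}
\]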



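Due to the measurability requirements of the tangent martingale trick we further
split up the above sum over $k \in \mathbb{Z}$ into $m + j + \theta(j) + 1 \lesssim m + j + 1$ subseries:
\[
\sum_{k \in \mathbb{Z}} = \sum_{k_0 = 0}^{m + j + \theta(j)} \sum_{k \equiv k_0 \bmod m + j + \theta(j) + 1}.
\]
The point is that $y \mapsto \chi_S(y)\Delta^{b_1}_{k+m}f(y)/b_1(y)$ is constant on the subcubes of generation
$k + m + 1 = k' - j - \theta(j)$, where $k' = k + (m + j + \theta(j) + 1)$. Applying the tangent
martingale trick to each of these subseries then yields that
\[
\begin{aligned}
&\Big| \sum_{k \in \mathbb{Z}} \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ D(Q,R)/\ell(R) \sim \delta^{-j}}} \langle g, \psi_R\rangle T_{RQ} \langle \varphi_Q, f\rangle \Big| \\
&\qquad \lesssim \delta^{\frac{\alpha m}{2}} \delta^{\frac{\alpha^2 + \alpha d}{\alpha + 2d} j} \|g\|_{L^{p'}(X,Y^*)} \sum_{k_0 = 0}^{m + j + \theta(j)} \Big\| \sum_{k \equiv k_0} \varepsilon_k \sum_{S \in \mathcal{D}_{k-j-\theta(j)}} \chi_S \frac{\Delta^{b_1}_{k+m}f}{b_1} \Big\|_{L^p(\Omega \times X, Y)} \\
&\qquad \lesssim \delta^{\frac{\alpha m}{2}} \delta^{\frac{\alpha^2 + \alpha d}{\alpha + 2d} j} (m + j + 1) \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)},
\end{aligned}
\]
where the last inequality follows from the unconditional convergence of the adapted
martingale difference decomposition (after discarding $1/b_1$). Summing over $m, j =
0, 1, 2, \ldots$ yields the claim. □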



7. Cubes well inside another cube 



We consider the case $R \in \mathcal{D}'_{\mathrm{good}}$, $Q \in \mathcal{D}_{\mathrm{good}}$, $Q \subset R$ and $\ell(Q) < \delta^r\ell(R)$. As usual,
there is a need to introduce some cancellation. To this end, here we consider the
modified matrix
\[
\begin{aligned}
\widetilde{T}_{RQ} &= T_{RQ} - \langle b_2, T(b_1\varphi_Q)\rangle \langle \psi_R\rangle_Q \\
&= -\langle \chi_{X \setminus S} b_2, T(b_1\varphi_Q)\rangle \langle \psi_R\rangle_S + \sum_{\substack{S' \subset R \setminus S \\ \ell(S') = \delta\ell(R)}} \langle \chi_{S'}\psi_R b_2, T(b_1\varphi_Q)\rangle,
\end{aligned}
\]
where $S \subset R$ is such that $\ell(S) = \delta\ell(R)$ and $Q \subset S$. The point is that $Q$ is separated
from the rest of the subcubes $S'$ and we have introduced cancellation for this one
problematic subcube $S$. The correction terms form a paraproduct operator, the
boundedness of which will be considered in the next chapter.
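The second form of the modified matrix follows from a short computation, sketched below. It assumes, as is the case for Haar functions adapted to the grid $\mathcal{D}'$, that $\psi_R$ is supported on $R$ and constant on each child of $R$, so that $\langle \psi_R\rangle_Q = \langle \psi_R\rangle_S$ for $Q \subset S$; the sums over $S'$ run over the children of $R$ other than $S$.

```latex
\begin{align*}
T_{RQ} = \langle b_2\psi_R, T(b_1\varphi_Q)\rangle
 &= \langle\psi_R\rangle_S\,\langle \chi_S b_2, T(b_1\varphi_Q)\rangle
   + \sum_{S'} \langle \chi_{S'}\psi_R b_2, T(b_1\varphi_Q)\rangle, \\
\langle b_2, T(b_1\varphi_Q)\rangle\,\langle\psi_R\rangle_Q
 &= \langle\psi_R\rangle_S\big[\langle \chi_S b_2, T(b_1\varphi_Q)\rangle
   + \langle \chi_{X\setminus S} b_2, T(b_1\varphi_Q)\rangle\big], \\
\widetilde T_{RQ}
 = T_{RQ} - \langle b_2, T(b_1\varphi_Q)\rangle\,\langle\psi_R\rangle_Q
 &= -\langle\psi_R\rangle_S\,\langle \chi_{X\setminus S} b_2, T(b_1\varphi_Q)\rangle
   + \sum_{S'} \langle \chi_{S'}\psi_R b_2, T(b_1\varphi_Q)\rangle .
\end{align*}
```

The term over $S$ itself cancels, which is exactly the cancellation introduced for the one subcube that is not separated from $Q$.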

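We again begin with some estimates for the matrix $\widetilde{T}_{RQ}$. Let us be brief as
these estimates follow pretty much as in [HM09, p. 20-21]. Fix some $z \in Q$.
Recalling that for every ball $B = B(c_B, r_B)$ and for every $\epsilon > 0$ we have the
estimate (integrate over dyadic blocks $2^j r_B \le d(x, c_B) < 2^{j+1} r_B$ or see [HM09,
Lemma 2.4])
\[
\int_{X \setminus B} \frac{d(x, c_B)^{-\epsilon}}{\lambda(c_B, d(x, c_B))}\,d\mu(x) \lesssim_\epsilon r_B^{-\epsilon},
\]
we establish by changing $K(x, y)$ to $K(x, y) - K(x, z)$ (using $\int b_1\varphi_Q\,d\mu = 0$), using
the kernel estimates and noting that $X \setminus S \subset X \setminus B(z, d(Q, X \setminus S))$ that
\[
|\langle \chi_{X \setminus S} b_2, T(b_1\varphi_Q)\rangle| \lesssim \ell(Q)^\alpha \|\varphi_Q\|_{L^1(\mu)}\,d(Q, X \setminus S)^{-\alpha}.
\]
To see that it was legitimate to use the kernel estimates note that in the
corresponding integral $d(x,z) \ge d(X \setminus S, Q) \ge \ell(Q)^\gamma\ell(S)^{1-\gamma} \ge \delta^{-r(1-\gamma)}\ell(Q)$, so that
$d(x, z) \ge Cd(y, z)$ choosing $r$ large enough. Furthermore, note that $d(Q, X \setminus S) \ge
\ell(Q)^\gamma\ell(S)^{1-\gamma} \gtrsim \ell(Q)^{1/2}\ell(R)^{1/2}$, and so continuing the above estimates we obtain
\[
|\langle \chi_{X \setminus S} b_2, T(b_1\varphi_Q)\rangle| \lesssim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2} \|\varphi_Q\|_{L^1(\mu)}.
\]
For the other finitely many terms involving a subcube $S' \subset R$ (where we have
separation) we have using Lemma 6.2 (or actually, a trivial modification) that
\[
|\langle \chi_{S'}\psi_R b_2, T(b_1\varphi_Q)\rangle| \lesssim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2} \frac{\|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}}{\lambda(z, \ell(S'))} \lesssim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2} \frac{\|\varphi_Q\|_{L^1(\mu)} \|\psi_R\|_{L^1(\mu)}}{\mu(R)},
\]
where the last estimate follows after noting that
\[
\mu(R) \le \mu(B(z, C\ell(R))) \le \lambda(z, C\ell(R)) = \lambda(z, C\delta^{-1}\ell(S')) \lesssim \lambda(z, \ell(S')).
\]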



Let us recapitulate all this as a lemma. 
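7.1. Lemma. If $R \in \mathcal{D}'$, $Q \in \mathcal{D}_{\mathrm{good}}$, $Q \subset R$, $\ell(Q) < \delta^r\ell(R)$ and $S$ is the subcube of
$R$ for which $\ell(S) = \delta\ell(R)$ and $Q \subset S$, we have
\[
|\widetilde{T}_{RQ}| \lesssim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2} \|\varphi_Q\|_{L^1(\mu)} \Big( |\langle \psi_R\rangle_S| + \frac{\|\psi_R\|_{L^1(\mu)}}{\mu(R)} \Big).
\]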

A familiar strategy involving kernels and the tangent martingale trick shall
now be employed (as in the previous chapter and as in [Hyt09b]). For this, the
following lemma is both natural and useful.

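7.2. Lemma. If $R \in \mathcal{D}'$, $Q \in \mathcal{D}_{\mathrm{good}}$, $Q \subset R$, $\ell(Q) < \delta^r\ell(R)$ and $S$ is the subcube of
$R$ for which $\ell(S) = \delta\ell(R)$ and $Q \subset S$, we have
\[
|\psi_R(x)\widetilde{T}_{RQ}\varphi_Q(y)| \lesssim \Big(\frac{\ell(Q)}{\ell(R)}\Big)^{\alpha/2} \Big( \frac{\chi_R(x)}{\mu(R)} + \frac{\chi_S(x)}{\mu(S)} \Big).
\]
Proof. Taking the previous lemma and the estimates $\|\varphi_Q\|_{L^1(\mu)} \|\varphi_Q\|_{L^\infty(\mu)} \lesssim 1$ and
$\|\psi_R\|_{L^1(\mu)} \|\psi_R\|_{L^\infty(\mu)} \lesssim 1$ into account it suffices to prove that
\[
|\psi_R(x)\langle \psi_R\rangle_S| \lesssim \frac{\chi_R(x)}{\mu(R)} + \frac{\chi_S(x)}{\mu(S)}.
\]
This follows by recalling that $\psi_R = \psi^{b_2}_{R,v}$ for some $v$, denoting $S = R_w$, subdividing
the estimation into cases ($v = w$ and $x \in S$), ($v = w$ and $x \in R \setminus S$) and $v \ne w$,
and finally recalling the explicit form of the adapted Haar functions $\psi^{b_2}_{R,v}$
(or that $|\psi_R| \sim \mu(R)^{-1/2}$ if $v = 0$, in which case no subdivision into cases is necessary). □

We are now ready to prove the main result of this section.

7.3. Proposition. It holds
\[
\Big| \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) < \delta^r\ell(R)}} \langle g, \psi_R\rangle \widetilde{T}_{RQ} \langle \varphi_Q, f\rangle \Big| \lesssim \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}.
\]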



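Proof. Let $s(R)$ denote the number of subcubes of a cube $R \in \mathcal{D}'$ and set $s =
\max_{R \in \mathcal{D}'} s(R) \lesssim 1$. Fix $w \in \{1, \ldots, s\}$ and $m \in \{r + 1, r + 2, \ldots\}$. The already used
randomization trick gives
\[
\begin{aligned}
&\Big| \sum_{k \in \mathbb{Z}} \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \langle g, \psi_R\rangle \widetilde{T}_{RQ} \langle \varphi_Q, f\rangle \Big| \\
&\qquad \lesssim \|g\|_{L^{p'}(X,Y^*)} \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \psi_R \widetilde{T}_{RQ} \langle \varphi_Q, f\rangle \Big\|_{L^p(\Omega \times X, Y)}.
\end{aligned}
\]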
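We introduce the relevant kernels now. Indeed, set
\[
K_R(x,y) = \delta^{-\alpha m/2} \mu(R) \chi_{R \setminus R_w}(x) \psi_R(x) \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \widetilde{T}_{RQ} \varphi_Q(y) b_1(y)
\]
and
\[
\widetilde{K}_R(x,y) = \delta^{-\alpha m/2} \mu(R_w) \chi_{R_w}(x) \psi_R(x) \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \widetilde{T}_{RQ} \varphi_Q(y) b_1(y).
\]
The previous lemma yields at once that $|K_R(x,y)| \lesssim 1$ and $|\widetilde{K}_R(x,y)| \lesssim 1$. Also,
the supports lie in $R \times R$ and $R_w \times R_w$ respectively. There holds
\[
\begin{aligned}
&\sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \psi_R(x) \widetilde{T}_{RQ} \langle \varphi_Q, f\rangle \\
&\qquad = \delta^{\alpha m/2} \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \frac{\chi_R(x)}{\mu(R)} \int_R K_R(x,y) \frac{\chi_R(y)\Delta^{b_1}_{k+m}f(y)}{b_1(y)}\,d\mu(y) \\
&\qquad\quad + \delta^{\alpha m/2} \sum_{k \in \mathbb{Z}} \varepsilon_k \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \frac{\chi_{R_w}(x)}{\mu(R_w)} \int_{R_w} \widetilde{K}_R(x,y) \frac{\chi_{R_w}(y)\Delta^{b_1}_{k+m}f(y)}{b_1(y)}\,d\mu(y).
\end{aligned}
\]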

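The tangent martingale trick cannot quite yet be used - the measurability
conditions need not hold (note the important difference with the argument of the
previous section - there we did not have the dyadic systems $\mathcal{D}$ and $\mathcal{D}'$ mixed in
the way we have here). To fix this, one simply defines new partitions
\[
\mathcal{F}_k = \{S \cap Q \ne \emptyset : S \in \mathcal{D}'_k,\ Q \in \mathcal{D}_{k-r-1}\}
\]
and exploits the goodness of the cubes $R$ via the observations
\[
\mathcal{D}'_{\mathrm{good},k} \subset \mathcal{F}_k \quad \text{and} \quad \{R_w \in \mathcal{D}'_{k+1} : R_w \subset R \in \mathcal{D}'_{\mathrm{good},k}\} \subset \mathcal{F}_{k+1}.
\]
We then extend the above sums to be over the sets $\mathcal{F}_k$ and $\mathcal{F}_{k+1}$ respectively
by using zero kernels for all the new sets $R$. We may then apply the tangent
martingale trick after passing to the obvious subseries over $k$ yielding, just like
in the previous section, the bound
\[
\Big| \sum_{k \in \mathbb{Z}} \sum_{R \in \mathcal{D}'_{\mathrm{good},k}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+m} \\ Q \subset R_w}} \langle g, \psi_R\rangle \widetilde{T}_{RQ} \langle \varphi_Q, f\rangle \Big| \lesssim \delta^{\alpha m/2} (m + r + 1) \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}
\]
from which the claim follows after summing over $m = r + 1, r + 2, \ldots$ and $w =
1, \ldots, s$. □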
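8. The correction term and the relevant paraproduct

Recall that we subtracted $\langle b_2, T(b_1\varphi_Q)\rangle \langle \psi_R\rangle_Q$ from $T_{RQ}$ in the case $R \in \mathcal{D}'_{\mathrm{good}}$,
$Q \in \mathcal{D}_{\mathrm{good}}$, $Q \subset R$ and $\ell(Q) < \delta^r\ell(R)$. Thus, we now need to consider the sum
\[
\sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) < \delta^r\ell(R)}} \langle g, \psi_R\rangle \langle b_2, T(b_1\varphi_Q)\rangle \langle \psi_R\rangle_Q \langle \varphi_Q, f\rangle. \tag{8.1}
\]
Recall also that we always have the suppressed summation over $u, v$ and the
restriction that $\delta^{k_0} < \ell(Q), \ell(R) \le \delta^m$. Writing out the above sum unhiding these
conventions and then recalling that e.g. $\Delta^{b_1}_Q f = \sum_u b_1\varphi_{Q,u} \langle \varphi_{Q,u}, f\rangle$, we see that
(writing explicitly only the relevant restrictions)
\[
(8.1) = \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} \\ \ell(Q) > \delta^{k_0}}} \Big( \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}},\ R \supset Q \\ \delta^{-r}\ell(Q) < \ell(R) \le \delta^m}} \langle \Delta^{b_2}_R g/b_2\rangle_Q + \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}},\ R \supset Q \\ \delta^{-r}\ell(Q) < \ell(R) = \delta^m}} \langle E^{b_2}_R g/b_2\rangle_Q \Big) \langle T^*b_2, \Delta^{b_1}_Q f\rangle.
\]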
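The two collapses of the suppressed $u$ and $v$ summations behind this rewriting can be sketched as follows. The computation is written for cubes $R$ and $Q$ below the coarsest level (at $\ell(R) = \delta^m$ the index $v = 0$ contributes the additional term $E^{b_2}_R g$), and it assumes that $T^*$ denotes the formal transpose of $T$ with respect to the bilinear pairing $\langle \cdot, \cdot\rangle$.

```latex
\begin{align*}
\sum_{v}\langle g,\psi_{R,v}\rangle\,\langle\psi_{R,v}\rangle_Q
 &= \Big\langle \frac{1}{b_2}\sum_{v} b_2\psi_{R,v}\,\langle\psi_{R,v},g\rangle \Big\rangle_Q
  = \langle \Delta^{b_2}_R g/b_2\rangle_Q, \\
\sum_{u}\langle b_2, T(b_1\varphi_{Q,u})\rangle\,\langle\varphi_{Q,u},f\rangle
 &= \Big\langle b_2,\ T\Big(\sum_{u} b_1\varphi_{Q,u}\,\langle\varphi_{Q,u},f\rangle\Big)\Big\rangle
  = \langle b_2, T(\Delta^{b_1}_Q f)\rangle
  = \langle T^*b_2, \Delta^{b_1}_Q f\rangle .
\end{align*}
```

Both lines use only the representation $\Delta^{b}_Q f = \sum_u b\varphi_{Q,u}\langle\varphi_{Q,u},f\rangle$ recalled in the text and the linearity of $T$.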



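Now we use the trick from [Hyt09b] noting that the inner summation would
collapse to $\langle E^{b_2}_R g/b_2\rangle_Q = \langle g\rangle_R / \langle b_2\rangle_R$, where $R \in \mathcal{D}'$ is the unique cube of
generation $\mathrm{gen}(Q) - r$ for which $Q \subset R$, were it not for the restriction to good $\mathcal{D}'$-cubes in
the summation. Now it is clear why Lemma 4.5 was worth proving. Indeed, we
may achieve this effect just by considering the grid $\mathcal{D}$ being fixed and averaging
over all the other random quantities used in the randomization of cubes. We use
Lemma 4.5 twice. First, to remove the restriction to good $R$, and after collapsing
the series, to put the restriction back. This yields
\[
\begin{aligned}
E\,(8.1) &= E \sum_{Q \in \mathcal{D}_{\mathrm{good}}} \sum_{\substack{R \in \mathcal{D}'_{\mathrm{good}},\ R \supset Q \\ \ell(R) = \delta^{-r}\ell(Q)}} \frac{\langle g\rangle_R}{\langle b_2\rangle_R} \langle T^*b_2, \Delta^{b_1}_Q f\rangle \\
&= E \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) = \delta^r\ell(R)}} \frac{\langle g\rangle_R}{\langle b_2\rangle_R} \langle T^*b_2, \Delta^{b_1}_Q f\rangle,
\end{aligned}
\]
where the standard summation conditions were yet again suppressed.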

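Notice now that the right hand side of this is the expectation of a pairing
$\langle \Pi g, f\rangle$, where we have (for every fixed choice of the random quantities) the
paraproduct
\[
\Pi g = \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) = \delta^r\ell(R)}} \frac{\langle g\rangle_R}{\langle b_2\rangle_R} \langle T^*b_2, b_1\varphi_Q\rangle \varphi_Q.
\]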
We shall next study this with any fixed choice of the random quantities. Note
that in [HM09] the paraproduct had the inessential difference that instead of the
requirement of $Q$ being good we had the requirement $d(Q, X \setminus R) \ge CC_0\ell(Q)$
(which follows from the goodness), and the essential difference that the bigger
cubes were not restricted to good cubes. As was noted in [Hyt09b], this restriction
is useful in this vector-valued context.



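8.2. Lemma. If $\varphi \in \mathrm{BMO}^p_\kappa(\mu)$, then
\[
\Big\| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) < \delta^r\ell(R)}} \varepsilon_Q \langle \varphi, b_1\varphi_Q\rangle \varphi_Q \Big\|_{L^p(\Omega \times X)} \lesssim \mu(R)^{1/p} \|\varphi\|_{\mathrm{BMO}^p_\kappa(\mu)}.
\]
Proof. This can be proven similarly as [HM09, Lemma 7.1] borrowing some minor
additional ingredients related to this vector-valued context from the proof of
[Hyt09b, Lemma 9.3]. □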



Since $T^*b_2 \in \mathrm{RBMO}(\mu) \subset \mathrm{BMO}^p_\kappa(\mu)$ for any $1 \le p < \infty$ (see the relevant
chapter of the present work), the previous lemma is important in proving that the
paraproduct $\Pi$ is bounded. We will not provide the exact details, instead citing
[Hyt09b], as this part of the argument no longer has anything special to do with
the metric space structure or with our use of more general measures. Indeed,
having been able to do all these reductions in the metric space setting, one can
now follow the argument found in [Hyt09b, p. 32-33] pretty much word for word
(when reading that, notice that chapter 3 of [Hyt09b] is already in an abstract
form suitable for us), and this yields:

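8.3. Proposition. We have
\[
\|\Pi g\|_{L^{p'}(X,Y^*)} \lesssim \|T^*b_2\|_{\mathrm{BMO}^p_\kappa(\mu)} \|g\|_{L^{p'}(X,Y^*)} \lesssim \|g\|_{L^{p'}(X,Y^*)}.
\]
The main result of this chapter now readily follows.

8.4. Proposition. We have
\[
\Big| E \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ Q \subset R \\ \ell(Q) < \delta^r\ell(R)}} \langle g, \psi_R\rangle \langle b_2, T(b_1\varphi_Q)\rangle \langle \psi_R\rangle_Q \langle \varphi_Q, f\rangle \Big| \lesssim \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)},
\]
where we average over all the random quantities used in the randomization of the cubes.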



9. Estimates for adjacent cubes of comparable size 

We shall now deal with the part of the series where good cubes $Q \in \mathcal{D}_{\mathrm{good}}$ and
$R \in \mathcal{D}'_{\mathrm{good}}$ are adjacent ($d(Q,R) < CC_0\min(\ell(Q), \ell(R))$) and of comparable size
($|\mathrm{gen}(Q) - \mathrm{gen}(R)| \le r$). We denote the last condition by $\ell(Q) \sim \ell(R)$. Also, only
the size, and not the cancellation, properties of the adapted Haar functions are
used.
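We are given some fixed small $\epsilon > 0$. Given cubes $Q$ and $R$ define $\Delta = Q \cap R$,
$\delta_Q = \{x : d(x,Q) \le \epsilon\ell(Q) \text{ and } d(x, X \setminus Q) \le \epsilon\ell(Q)\}$ and $\delta_R = \{x : d(x,R) \le
\epsilon\ell(R) \text{ and } d(x, X \setminus R) \le \epsilon\ell(R)\}$. Also, set
\[
Q_b = Q \cap \bigcup_{R' \in \mathcal{D}' :\ \ell(R') \sim \ell(Q)} \delta_{R'}
\]
and
\[
R_b = R \cap \bigcup_{Q' \in \mathcal{D} :\ \ell(Q') \sim \ell(R)} \delta_{Q'}.
\]
Set also $Q_s = Q \setminus \Delta \setminus \delta_R$, $Q_\partial = Q \setminus \Delta \setminus Q_s$, $R_s = R \setminus \Delta \setminus \delta_Q$ and $R_\partial = R \setminus \Delta \setminus R_s$.
Furthermore, we still define that $\widetilde{\Delta} = \Delta \setminus \delta_Q \setminus \delta_R$.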

Given $R \in \mathcal{D}'_{\mathrm{good}}$, there are only finitely many $Q \in \mathcal{D}_{\mathrm{good}}$ which are adjacent to
$R$ and of comparable size. Thus, one needs only to study finitely many subseries
\[
\sum_R \langle g, \psi_R\rangle T_{RQ} \langle \varphi_Q, f\rangle,
\]
where $Q = Q(R)$ is implicitly a function of $R$ - a convention that is used throughout
this section. We shall also act like the mapping $R \mapsto Q(R)$ is invertible - this
only amounts to identifying some terms with zero (if there are no preimages) or
splitting into finitely many new subseries using the triangle inequality (if there
are multiple preimages).

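Recall that $T_{RQ} = \langle \psi_R b_2, T(b_1\varphi_Q)\rangle$. We note that
\[
b_1\varphi_Q \langle \varphi_Q, f\rangle = \sum_{\substack{Q' \in \mathcal{D} :\ Q' \subset Q \\ \ell(Q') = \delta\ell(Q)}} b_1\chi_{Q'} \langle \varphi_Q\rangle_{Q'} \langle \varphi_Q, f\rangle = \sum_{\substack{Q' \in \mathcal{D} :\ Q' \subset Q \\ \ell(Q') = \delta\ell(Q)}} b_1\chi_{Q'} A_{Q'},
\]
where $A_{Q'} = \langle \varphi_Q\rangle_{Q'} \langle \varphi_Q, f\rangle$. Similarly there holds
\[
b_2\psi_R \langle g, \psi_R\rangle = \sum_{\substack{R' \in \mathcal{D}' :\ R' \subset R \\ \ell(R') = \delta\ell(R)}} b_2\chi_{R'} B_{R'},
\]
where $B_{R'} = \langle \psi_R\rangle_{R'} \langle g, \psi_R\rangle$. Thus, we are left with finitely many new subseries of
the form
\[
\sum_{R \in \mathcal{D}'} B_R \langle \chi_R b_2, T(b_1\chi_Q)\rangle A_Q,
\]
where $Q = Q(R)$ is a new function of $R$ but one still has $\ell(Q) \sim \ell(R)$. Note also
that the parents of these cubes are always good.

Given $R$ and then $Q = Q(R)$ as in the above sum, we shall now split the pairing
$\langle \chi_R b_2, T(b_1\chi_Q)\rangle$ into several terms. First, we use that given $v \in (0, 1)$ there
exists an almost-covering $\mathcal{B}$ of $\widetilde{\Delta}$ by separated balls in the sense that we have the
following properties:
\[
\begin{aligned}
&\mu\Big(\widetilde{\Delta} \setminus \bigcup_{B \in \mathcal{B}} B\Big) \le v\mu(\Delta), \\
&\Lambda B \subset \Delta \text{ for every } B \in \mathcal{B}, \\
&d(B, B') \ge v\max(r_B, r_{B'}) \text{ if } B, B' \in \mathcal{B},\ B \ne B', \\
&\#\mathcal{B} \le C(\epsilon, v).
\end{aligned}
\]
For the details of the probabilistic construction of $\mathcal{B}$, see chapters 8 and 9 of
[HM09]. We write $\Delta \setminus \bigcup \mathcal{B}$ as a disjoint union of $\Omega_i = \widetilde{\Delta} \setminus \bigcup \mathcal{B}$ and some sets
$\Omega_Q \subset Q_b$ and $\Omega_R \subset R_b$.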
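We now decompose
\[
\begin{aligned}
\langle \chi_R b_2, T(b_1\chi_Q)\rangle &= \langle \chi_{R_\partial} b_2, T(b_1\chi_Q)\rangle + \langle \chi_{R_s} b_2, T(b_1\chi_Q)\rangle \\
&\quad + \langle \chi_\Delta b_2, T(b_1\chi_{Q_\partial})\rangle + \langle \chi_\Delta b_2, T(b_1\chi_{Q_s})\rangle \\
&\quad + \langle \chi_{\Delta \setminus \bigcup\mathcal{B}} b_2, T(b_1\chi_\Delta)\rangle + \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\Delta \setminus \bigcup\mathcal{B}})\rangle \\
&\quad + \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\bigcup\mathcal{B}})\rangle = A + B + C + D + E + F + G.
\end{aligned}
\]
Furthermore, we decompose
\[
\begin{aligned}
E &= \langle \chi_{\Delta \setminus \bigcup\mathcal{B}} b_2, T(b_1\chi_\Delta)\rangle \\
&= \langle \chi_{\Omega_Q} b_2, T(b_1\chi_\Delta)\rangle + \langle \chi_{\Omega_R} b_2, T(b_1\chi_\Delta)\rangle + \langle \chi_{\Omega_i} b_2, T(b_1\chi_\Delta)\rangle \\
&= E_1 + E_2 + E_3
\end{aligned}
\]
and
\[
\begin{aligned}
F &= \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\Delta \setminus \bigcup\mathcal{B}})\rangle \\
&= \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\Omega_Q})\rangle + \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\Omega_R})\rangle + \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\Omega_i})\rangle \\
&= F_1 + F_2 + F_3.
\end{aligned}
\]
We still write
\[
\begin{aligned}
G &= \langle \chi_{\bigcup\mathcal{B}} b_2, T(b_1\chi_{\bigcup\mathcal{B}})\rangle \\
&= \sum_{B} \langle \chi_B b_2, T(b_1\chi_B)\rangle + \sum_{B \ne B'} \langle \chi_{B'} b_2, T(b_1\chi_B)\rangle = G_1 + G_2.
\end{aligned}
\]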

It is time to deal with these terms now. These belong to various different
groups: we have the terms with separation $B$, $D$ and $G_2$, the terms $C$, $E_1$ and $F_1$
involving the bad boundary region $Q_b$, the terms $A$, $E_2$ and $F_2$ involving the bad
boundary region $R_b$, the terms $E_3$ and $F_3$ involving $\Omega_i$ (and thus $v$), and, finally,
the term $G_1$ which shall be dealt with using the weak boundedness property.
Also, when we sum over $R$ we have to use different kinds of strategies involving
simple randomization, the tangent martingale trick and a certain improvement
of the contraction principle. In some cases control is gained only after using the a
priori boundedness of $T$, and in these cases it is essential to get a small constant
in front so that these may later be absorbed. In addition, the terms with the bad
boundary regions require that we average over all the dyadic grids too.
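Let us now do all this carefully. Using the weak boundedness property holding
for balls and the facts that $\Lambda B \subset \Delta$ for every $B \in \mathcal{B}$ and $\#\mathcal{B} \le C(\epsilon, v)$, we
obtain that $G_1 = \alpha_\Delta\mu(\Delta)$, where $|\alpha_\Delta| \le C(\epsilon, v)$. Using randomization, Hölder's
inequality and the contraction principle, we obtain (denoting the dyadic parent
of $Q$ by $\hat{Q}$ and similarly for $R$) that
\[
\begin{aligned}
\Big| \sum_R B_R G_1(R) A_Q \Big| &= \Big| \int_\Omega \int_X \Big\langle \sum_R \varepsilon_R B_R \chi_R,\ \sum_R \varepsilon_R \alpha_\Delta A_Q \chi_Q \Big\rangle\,d\mu\,dP \Big| \\
&\lesssim C(\epsilon, v) \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}.
\end{aligned}
\]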

We then deal with the terms for which the summation over $R$ can be handled
using this same simple randomization trick (the estimates for the corresponding
parts of the matrix element are, of course, different). One of these terms
is $G_2$. We obtain using the first kernel estimate, the doubling property of $\lambda$,
the separation of the different balls $B$ and $B'$ and the fact that $B, B' \subset \Delta$ that
$|G_2| \le C(\epsilon, v)\mu(\Delta)$. Also, we have using the a priori boundedness of $T$ and the
fact that $\mu(\Omega_i) \le v\mu(\Delta)$ that $|E_3| \lesssim v^{1/p'}\|T\|\mu(\Delta)$ and $|F_3| \lesssim v^{1/p}\|T\|\mu(\Delta)$. Using
the above randomization estimate then readily yields that
\[
\Big| \sum_R B_R G_2(R) A_Q \Big| \lesssim C(\epsilon, v) \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}
\]
and
\[
\Big| \sum_R B_R [E_3(R) + F_3(R)] A_Q \Big| \lesssim (v^{1/p} + v^{1/p'}) \|T\| \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}.
\]
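The smallness of $E_3$ and $F_3$ in terms of $v$ comes from Hölder's inequality, as the following sketch indicates. It assumes that $b_1$ and $b_2$ are bounded, that $\bigcup\mathcal{B} \subset \Delta$ and $\mu(\Omega_i) \le v\mu(\Delta)$, and that $T$ acts on scalar functions with norm at most $\|T\|$; the precise exponents in the source are not legible, so the ones below are those produced by this computation.

```latex
\begin{align*}
|E_3| = |\langle \chi_{\Omega_i} b_2, T(b_1\chi_\Delta)\rangle|
 &\le \|\chi_{\Omega_i} b_2\|_{L^{p'}(\mu)}\,\|T\|\,\|b_1\chi_\Delta\|_{L^{p}(\mu)} \\
 &\lesssim \mu(\Omega_i)^{1/p'}\,\mu(\Delta)^{1/p}\,\|T\|
  \le v^{1/p'}\,\|T\|\,\mu(\Delta), \\
|F_3| = |\langle \chi_{\bigcup\mathcal B} b_2, T(b_1\chi_{\Omega_i})\rangle|
 &\le \|\chi_{\bigcup\mathcal B} b_2\|_{L^{p'}(\mu)}\,\|T\|\,\|b_1\chi_{\Omega_i}\|_{L^{p}(\mu)} \\
 &\lesssim \mu(\Delta)^{1/p'}\,\mu(\Omega_i)^{1/p}\,\|T\|
  \le v^{1/p}\,\|T\|\,\mu(\Delta).
\end{align*}
```

Both bounds have the form (small constant) $\cdot\,\|T\|\mu(\Delta)$, which is the shape needed for the later absorption of $\|T\|$.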

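We now deal with the rest of the terms having separation (we already dealt
with $G_2$). Namely, let us estimate the terms $B$ and $D$. However, these are so
similar that we only explicitly handle $B$ here. The first kernel estimate yields
\[
|B| \lesssim \int_{R_s} \int_Q \frac{d\mu(y)\,d\mu(x)}{\lambda(y, d(x,y))}.
\]
Then we note that $\lambda(y, d(x,y)) \ge \lambda(y, d(x,Q)) \ge \lambda(y, \epsilon\ell(Q)) \gtrsim \epsilon^d\lambda(y, \ell(Q))$. Thus,
we may write
\[
B = \beta_Q \frac{\mu(Q)\mu(R)}{\inf_{y \in Q} \lambda(y, \ell(Q))},
\]
where $|\beta_Q| \lesssim \epsilon^{-d}$ (note that the infimum may be zero only if $\mu(Q) = 0$). Now we
may write
\[
\sum_R B_R B(R) A_Q = \sum_R \langle g, \psi_{\hat{R}}\rangle \|\psi_{\hat{R}}\|_{L^1(\mu)} \frac{\tilde{\beta}_Q}{\inf_{y \in Q} \lambda(y, \ell(Q))} \|\varphi_{\hat{Q}}\|_{L^1(\mu)} \langle \varphi_{\hat{Q}}, f\rangle,
\]
where $|\tilde{\beta}_Q| \le |\beta_Q| \lesssim \epsilon^{-d}$. Recall that these parents $\hat{R}$ and $\hat{Q}$ are again good cubes.
Also recall that every cube has at most $\lesssim 1$ children. So it remains to study the
series
\[
\sum_R \langle g, \psi_R\rangle \|\psi_R\|_{L^1(\mu)} \frac{\beta_Q}{\inf_{y \in Q} \lambda(y, \ell(Q))} \|\varphi_Q\|_{L^1(\mu)} \langle \varphi_Q, f\rangle,
\]
where again $|\beta_Q| \lesssim \epsilon^{-d}$ (note that $\lambda(y, \ell(\hat{Q})) \lesssim \lambda(y, \ell(Q))$). Using a randomization
trick and then reindexing the summation we see that this may be dominated by
$\|g\|_{L^{p'}(X,Y^*)}$ multiplied with
\[
\Big\| \sum_{k} \varepsilon_k \sum_{S \in \mathcal{D}_k} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+2r} \\ Q \subset S}} \|\psi_R\|_{L^1(\mu)} \psi_R(x) \frac{\beta_Q}{\inf_{y \in Q} \lambda(y, \ell(Q))} \|\varphi_Q\|_{L^1(\mu)} \langle \varphi_Q, f\rangle \Big\|_{L^p(\Omega \times X, Y)}.
\]
Since $R$ is good, $\ell(R) \le \delta^{-r}\ell(Q) = \delta^r\ell(S)$ and $CC_0\ell(R) \ge d(Q,R)$, one easily
checks that $R \subset S$ (if $r$ is large enough). We then set for $S \in \mathcal{D}_k$ that
\[
K_S(x,y) = \epsilon^d \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good},k+2r} \\ Q \subset S}} \|\psi_R\|_{L^1(\mu)} \psi_R(x) \frac{\mu(S)}{\inf_{y \in Q} \lambda(y, \ell(Q))} \beta_Q \|\varphi_Q\|_{L^1(\mu)} \varphi_Q(y) b_1(y),
\]
and note that the previous majorant can now be written in the form
\[
\epsilon^{-d} \Big\| \sum_k \varepsilon_k \sum_{S \in \mathcal{D}_k} \frac{\chi_S}{\mu(S)} \int_S K_S(\cdot, y) \frac{\Delta^{b_1}_{k+2r}f(y)}{b_1(y)}\,d\mu(y) \Big\|_{L^p(\Omega \times X, Y)},
\]
which is amenable to the tangent martingale trick as is next demonstrated. Indeed,
just note that $K_S$ is supported on $S \times S$ and that $|K_S(x,y)| \lesssim 1$ holds, and
then divide the summation over $k$ into $\lesssim 1$ appropriate pieces to get that
\[
\Big| \sum_R B_R B(R) A_Q \Big| \lesssim \epsilon^{-d} \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)}.
\]
The same, as already stated earlier, works with $B$ replaced by $D$.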

It still remains to deal with the terms involving bad boundary regions. The
small term in front of $\|T\|$ is gained only after averaging over the dyadic grids
$\mathcal{D}$ and $\mathcal{D}'$. Somewhat tediously we have six ($A$, $C$, $E_1$, $E_2$, $F_1$ and $F_2$) kind of
similar terms to deal with. We only deal with the term $E_1 = \langle \chi_{\Omega_Q} b_2, T(b_1\chi_\Delta)\rangle$
- we chose this term as it shares the additional (albeit small) difficulty with the
term $F_2$ (not present in the four other cases) that the bad boundary region part is
in some sense in the unnatural slot (here $\Omega_Q \subset Q_b$ is in the slot with $b_2$). What is
useful here is that everything is inside $\Delta$ anyway.

We turn to the details. Using randomization, Hölder's inequality and the a
priori boundedness of $T$ one gets
\[
\Big| \sum_R B_R E_1(R) A_Q \Big| \le \|T\| \Big\| \sum_R \varepsilon_R B_R \chi_{\Omega_{Q(R)}} b_2 \Big\|_{L^{p'}(\Omega \times X, Y^*)} \Big\| \sum_Q \varepsilon_Q A_Q b_1\chi_\Delta \Big\|_{L^p(\Omega \times X, Y)}.
\]
Now the second term is easily seen to be dominated by $\|f\|_{L^p(X,Y)}$ using the
contraction principle and unconditionality.

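The first term is more involved since it is here that the small factor needs to be
extracted. Let us define
\[
\delta(k) = \bigcup_{j = k-2r}^{k+2r} \bigcup_{R \in \mathcal{D}'_j} \delta_R.
\]
Note that if $\mathrm{gen}(R) = k$, then $\mathrm{gen}(Q(R)) \in [k - r, k + r]$, and so we must have
$\chi_{\Omega_{Q(R)}} = \chi_{\Omega_{Q(R)}} \chi_{\delta(k)} \chi_R$ (recall that $\Omega_{Q(R)} \subset \Delta \subset R$). Throwing $\chi_{\Omega_{Q(R)}}$ and $b_2$ away
using the contraction principle, we get
\[
\Big\| \sum_R \varepsilon_R B_R \chi_{\Omega_{Q(R)}} b_2 \Big\|_{L^{p'}(\Omega \times X, Y^*)} \lesssim \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \chi_{\delta(k)} \sum_{R \in \mathcal{D}'_k} B_R \chi_R \Big\|_{L^{p'}(\Omega \times X, Y^*)}.
\]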

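Now, keeping everything else fixed, we take the conditional expectation of this
over the grids $\mathcal{D}'$. Using Jensen's inequality and Fubini's theorem, we get
\[
\begin{aligned}
&E \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \chi_{\delta(k)} \sum_{R \in \mathcal{D}'_k} B_R \chi_R \Big\|_{L^{p'}(\Omega \times X, Y^*)} \\
&\qquad \le \Big( \int_X E \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \chi_{\delta(k)}(x) \sum_{R \in \mathcal{D}'_k} B_R \chi_R(x) \Big\|^{p'}_{L^{p'}(\Omega, Y^*)}\,d\mu(x) \Big)^{1/p'}.
\end{aligned}
\]
In order to gain access to a certain improvement of the contraction principle (to
be formulated shortly), it is still beneficial to further dominate this by
\[
\Big( \int_X \Big( E \Big\| \sum_{k \in \mathbb{Z}} \varepsilon_k \chi_{\delta(k)}(x) \sum_{R \in \mathcal{D}'_k} B_R \chi_R(x) \Big\|^{t}_{L^{p'}(\Omega, Y^*)} \Big)^{p'/t}\,d\mu(x) \Big)^{1/p'},
\]
where $t \ge p'$. We now fix $t$ once and for all demanding only that it is larger than
$p$, $p'$, the cotype of $Y$ and the cotype of $Y^*$ (recall that the dual of a UMD space is
UMD and that a UMD space has nontrivial cotype). The requirements involving
$p$ and the cotype of $Y$ are only needed when handling some of the other similar
terms.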

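We now formulate the contraction principle we need (this is [HV09, Lemma
3.1]).

9.1. Proposition. Suppose $Z$ is a Banach space of cotype $s \in [2, \infty)$, $\xi_j \in Z$, $s < u < \infty$
and $\theta_j \in L^u(\widetilde{\Omega})$ (here $\widetilde{\Omega}$ is just some probability space). Then
\[
\Big\| \sum_{j=1}^\infty \varepsilon_j \theta_j \xi_j \Big\|_{L^u(\widetilde{\Omega}; L^2(\Omega, Z))} \lesssim \sup_j \|\theta_j\|_{L^u(\widetilde{\Omega})} \Big\| \sum_{j=1}^\infty \varepsilon_j \xi_j \Big\|_{L^2(\Omega, Z)}.
\]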

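Utilizing the above contraction principle together with Lemma 4.1 and Kahane's
inequality gives (here the $L^t$ norm is taken over the probability space used
in the randomization of $\mathcal{D}'$)
\[
\begin{aligned}
E \Big\| \sum_R \varepsilon_R B_R \chi_{\Omega_{Q(R)}} b_2 \Big\|_{L^{p'}(\Omega \times X, Y^*)}
&\lesssim \Big( \int_X \Big[ \sup_k \|\chi_{\delta(k)}(x)\|_{L^t} \Big]^{p'} \Big\| \sum_R \varepsilon_R B_R \chi_R(x) \Big\|^{p'}_{L^{p'}(\Omega, Y^*)}\,d\mu(x) \Big)^{1/p'} \\
&\lesssim \epsilon^{\eta/t} \Big\| \sum_R \varepsilon_R B_R \chi_R \Big\|_{L^{p'}(\Omega \times X, Y^*)} \\
&\lesssim \epsilon^{\eta/t} \|g\|_{L^{p'}(X,Y^*)}.
\end{aligned}
\]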



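We now formulate the above considerations as a proposition.

9.2. Proposition. Let $\epsilon > 0$ and $v \in (0, 1)$. We have the estimate
\[
E \Big| \sum_{R \in \mathcal{D}'_{\mathrm{good}}} \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}} :\ \ell(Q) \sim \ell(R) \\ d(Q,R) < CC_0\min(\ell(Q), \ell(R))}} \langle g, \psi_R\rangle T_{RQ} \langle \varphi_Q, f\rangle \Big| \lesssim \big(C(\epsilon, v) + c(\epsilon, v)\|T\|\big) \|g\|_{L^{p'}(X,Y^*)} \|f\|_{L^p(X,Y)},
\]
where we average over all the random quantities used in the randomization of the cubes,
and $c(\epsilon, v)$ can be made arbitrarily small by choosing $\epsilon$ and $v$ small enough.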

9.3. Remark. Recall that when we dealt with the separated cubes in Proposition
6.3 we had the assumption that the adapted Haar functions related to the smaller
cubes are cancellative. Note that there are only boundedly many terms with
$\ell(Q) = \ell(R) = \delta^m$ where the contrary can happen (due to the assumptions about
the supports of the functions $f$ and $g$). Thus, the relevant arguments involving
separated sets used in the present chapter let us also remove this assumption.

10. Completion of the proof 
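Combining all that we have done in the previous sections shows that
\[
E \Big| \sum_{\substack{Q \in \mathcal{D}_{\mathrm{good}},\ R \in \mathcal{D}'_{\mathrm{good}} \\ \delta^{k_0} < \ell(Q), \ell(R) \le \delta^m}} \sum_{u,v} \langle g, \psi^{b_2}_{R,v}\rangle \langle b_2\psi^{b_2}_{R,v}, T(b_1\varphi^{b_1}_{Q,u})\rangle \langle \varphi^{b_1}_{Q,u}, f\rangle \Big| \le C(\epsilon, v) + c(\epsilon, v)\|T\|,
\]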
where $c(\epsilon, v) \to 0$ when $\epsilon \to 0$ and $v \to 0$. Recalling (5.1) the estimate $\|T\| \lesssim 1$
follows by taking $\epsilon$ and $v$ small enough. We have proved what we set out to
prove, namely Theorem 2.9.
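The final absorption step can be made explicit as follows. The threshold $1/16$ is an illustrative choice and not one made in the paper; the sketch assumes, as throughout, that $\|T\|$ is finite a priori, so that it may be subtracted from both sides.

```latex
\begin{align*}
\tfrac18\|T\| &\le C(\epsilon,v) + c(\epsilon,v)\,\|T\|
  && \text{(by (5.1) and the combined estimate)} \\
c(\epsilon,v) \le \tfrac1{16}
 \quad&\Longrightarrow\quad
 \tfrac1{16}\|T\| \le C(\epsilon,v)
 \quad\Longrightarrow\quad
 \|T\| \le 16\,C(\epsilon,v).
\end{align*}
```

Once $\epsilon$ and $v$ are fixed in this way, $C(\epsilon, v)$ is a constant depending only on the admissible parameters, which gives $\|T\| \lesssim 1$.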